Preface


Smart antennas involve processing of signals induced on an array of sensors such as 
antennas, microphones, and hydrophones. They have applications in the areas of radar, 
sonar, medical imaging, and communications. 

Smart antennas have the property of spatial filtering, which makes it possible to receive 
energy from a particular direction while simultaneously blocking it from another direction. 
This property makes smart antennas a very effective tool in detecting and locating an 
underwater source of sound such as a submarine without using active sonar. The capacity 
of smart antennas to direct transmitting energy toward a desired direction makes them 
useful for medical diagnostic purposes. This characteristic also makes them very useful 
in canceling an unwanted jamming signal. In a communications system, an unwanted 
jamming signal is produced by a transmitter in a direction other than the direction of the 
desired signal. For a medical doctor trying to listen to the sound of a pregnant mother's 
heart, the jamming signal is the sound of the baby's heart. 

Processing signals from different sensors involves amplifying each signal before combining
them. The amount of gain of each amplifier dictates the properties of the antenna
array. To obtain the best possible cancellation of unwanted interferences, the gains of these 
amplifiers must be adjusted. How to go about doing this depends on many conditions 
including signal type and overall objectives. For optimal processing, the typical objective 
is maximizing the output signal-to-noise ratio (SNR). For an array with a specified 
response in the direction of the desired signal, this is achieved by minimizing the mean 
output power of the processor subject to specified constraints. In the absence of errors, 
the beam pattern of the optimized array has the desired response in the signal direction 
and reduced response in the directions of unwanted interference. 

The smart antenna field has been a very active area of research for over four decades. 
During this time, many types of processors for smart antennas have been proposed and 
their performance has been studied. Practical use of smart antennas was limited by
the excessive amount of processing power required. This limitation has now been overcome
to some extent by the availability of powerful computers.

Currently, the use of smart antennas in mobile communications to increase the capacity 
of communication channels has reignited research and development in this very exciting 
field. Practicing engineers now want to learn about this subject in a big way. Thus, there 
is a need for a book that could provide a learning platform. There is also a need for a 
book on smart antennas that could serve as a textbook for senior undergraduate and 
graduate levels, and as a reference book for those who would like to learn quickly about 
a topic in this area but do not have time to perform a journal literature search for the 
purpose. 

This book aims to provide a comprehensive and detailed treatment of various antenna 
array processing schemes, adaptive algorithms to adjust the required weighting on antennas,
direction-of-arrival (DOA) estimation methods including performance comparisons,
diversity-combining methods to combat fading in mobile communications, and effects of 
errors on array system performance and error-reduction schemes. The book brings almost 
all aspects of array signal processing together and presents them in a logical manner. It 
also contains extensive references to probe further.

After some introductory material in Chapter 1, the detailed work on smart antennas
starts in Chapter 2, where various processor structures suitable for narrowband fields are
discussed. The behavior of both element-space and beamspace processors is studied when
their performance is optimized. Optimization using the knowledge of the desired signal 
direction as well as the reference signal is considered. The processors considered include 
conventional beamformer; null-steering beamformer; minimum-variance distortionless 
beamformer, also known as optimal beamformer; generalized side-lobe canceller; and 
postbeamformer interference canceler. Detailed analysis of these processors in the absence 
of errors is carried out by deriving expressions for various performance measures. The 
effect of errors on these processors has been analyzed to show how performance degrades 
because of various errors. Steering vector, weight vector, phase shifter, and quantization 
errors are discussed. 

For various processors, solution of the optimization problem requires knowledge of the 
correlation between various elements of the antenna array. In practice, when this information
is not available, an estimate of the solution is obtained in real time from received
signals as these become available. There are many algorithms available in the literature 
to adaptively estimate the solution, with conflicting demands of implementation simplicity 
and speed with which the solution is estimated. Adaptive processing is presented in 
Chapter 3, with details on the sample matrix inversion algorithm, constrained and unconstrained
least mean squares (LMS) algorithms, recursive LMS algorithm, recursive least
squares algorithm, constant modulus algorithm, conjugate gradient method, and neural 
network approach. Detailed convergence analysis of many of these algorithms is presented 
under various conditions to show how the estimated solution converges to the optimal 
solution. Transient and steady-state behavior is analyzed by deriving expressions for 
various quantities of interest with a view to teach the underlying analysis tools. Many 
numerical examples are included to demonstrate how these algorithms perform. 

Smart antennas suitable for broadband signals are discussed in Chapter 4. Processing 
of broadband signals may be carried out in the time domain as well as in the frequency 
domain. Both aspects are covered in detail in this chapter. A tapped-delay line structure 
behind each antenna to process the broadband signals in the time domain is described 
along with its frequency response. Various constraints to shape the beam of the broadband 
antennas are derived, optimization for this structure is considered, and a suitable adaptive 
algorithm to estimate the optimal solution is presented. Various realizations of time-domain
broadband processors are discussed in detail along with the effect that the choice
of origin has on performance. A detailed treatment of frequency-domain processing of
broadband signals is presented and its relationship with time-domain processing is established.
Use of the discrete Fourier transform method to estimate the weights of the time-domain
structure and how its modular structure could help reduce real-time processing
are described.

Correlation between a desired signal and unwanted interference exists in situations of 
multipath signals, deliberate jamming, and so on, and can degrade the performance of an 
antenna array processor. Chapter 5 presents models for correlated fields in narrowband 
and broadband signals. Analytical expressions for SNRs in both narrowband and broadband
structures of smart antennas are derived, and the effects of several factors on SNR
are explored, including the magnitude and phase of the correlation, number of elements 
in the array, direction and level of the interference source and the level of the uncorrelated 
noise. Many methods are described to decorrelate the correlated sources, and analytical 
expressions are derived to show the decorrelation effect of the proposed techniques. 

In Chapter 6, various DOA estimation methods are described, followed by performance 
comparisons and sensitivity analyses. These estimation tools include spectral estimation 
methods, minimum variance distortionless response estimator, linear prediction method,
maximum entropy method, maximum likelihood method, various eigenstructure methods
including many versions of MUSIC algorithms, minimum norm methods, CLOSEST
method, ESPRIT method, and weighted subspace fitting method. This chapter also contains
discussion on various preprocessing and number-of-source estimation methods.

In the first six chapters, it is assumed that the directional signals arrive from point 
sources as plane wave fronts. In mobile communication channels, the received signal is a 
combination of many components arriving from various directions due to multipath 
propagation resulting in large fluctuation in the received signals. This phenomenon is 
called fading. In Chapter 7, a brief review of fading channels is presented, the distribution of
signal amplitude and received power on an antenna is developed, and noise- and
interference-limited single-antenna systems in Rayleigh and Nakagami fading channels
are analyzed by deriving results for average bit error rate and outage probability. The
results show how fading affects the performance of a single-antenna system.

Chapter 8 presents a comprehensive analysis of diversity combining, which is a process 
of combining several signals with independent fading statistics to reduce large attenuation 
of the desired signal in the presence of multipath signals. The diversity-combining schemes 
described and analyzed in this chapter include selection combiner, switched diversity 
combiner, equal gain combiner, maximum ratio combiner, optimal combiner, generalized 
selection combiner, cascade diversity combiner, and macroscopic diversity combiner. Both 
noise-limited and interference-limited systems are analyzed in various fading conditions 
by deriving results for average bit error rate and outage probability.

Widespread interest in smart antennas has continued for several decades due to their use
in numerous applications. The first issue of IEEE Transactions on Antennas and Propagation,
published in 1964 [IEE64], was followed by special issues of various journals [IEE76, IEE85,
IEE86, IEE87a, IEE87b], books [Hud81, Mon80, Hay85, Wid85, Com88, God00], a selected
bibliography [Mar86], and a vast number of specialized research papers. Some of the
general papers in which various issues are discussed include [App76, d'A80, d'A84, Gab76,
Hay92, Kri96, Mai82, Sch77, Sta74, Van88, Wid67].

The current demand for smart antennas to increase channel capacity in the fast-growing 
area of mobile communications has reignited the research and development efforts in this 
area around the world [God97]. This book aims to help researchers and developers by
providing a comprehensive and detailed treatment of the subject matter. Throughout the
book, references are provided in which smart antennas have been suggested for mobile
communication systems. This chapter presents some introductory material and terminology
associated with antenna arrays for those who are not familiar with antenna theory.


1.1 Antenna Gain 

Omnidirectional antennas radiate equal amounts of power in all directions. Also known 
as isotropic antennas, they have equal gain in all directions. Directional antennas, on the 
other hand, have more gain in certain directions and less in others. A direction in which 
the gain is maximum is referred to as the antenna boresight. The gain of directional 
antennas in the boresight is more than that of omnidirectional antennas, and is measured 
with respect to the gain of omnidirectional antennas. For example, a gain of 10 dBi (sometimes
indicated as dBic or simply dB) means the power radiated by this antenna is 10 dB
more than that radiated by an isotropic antenna.
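
The decibel figure above can be turned into a plain power ratio with a few lines of code. The following is a minimal sketch; the function names `db_to_ratio` and `ratio_to_db` are illustrative and do not come from the text.

```python
import math


def db_to_ratio(gain_db):
    """Convert a power gain in dB (or dBi) to a linear power ratio."""
    return 10.0 ** (gain_db / 10.0)


def ratio_to_db(power_ratio):
    """Convert a linear power ratio to dB."""
    return 10.0 * math.log10(power_ratio)


# A 10 dBi antenna radiates ten times the power of an isotropic
# antenna in its boresight direction.
print(db_to_ratio(10.0))
# Doubling the power corresponds to roughly 3 dB.
print(round(ratio_to_db(2.0), 2))
```
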

An antenna may be used to transmit or receive. The gain of an antenna remains the
same in both cases. The gain of a receiving antenna indicates the amount of power it
delivers to the receiver compared to an omnidirectional antenna.


1.2 Phased Array Antenna 

A phased array antenna uses an array of antennas. Each antenna forming the array is 
known as an element of the array. The signals induced on different elements of an array 
are combined to form a single output of the array. 

This process of combining the signals from different elements is known as beamforming. 
The direction in which the array has maximum response is said to be the beam-pointing 
direction. Thus, this is the direction in which the array has the maximum gain. When signals 
are combined without any gain and phase change, the beam-pointing direction is broadside 
to the linear array, that is, perpendicular to the line joining all elements of the array. 

By adjusting the phase difference among various antennas, one is able to control the beam-pointing
direction. The signals induced on various elements after phase adjustment due to
a source in the beam-pointing direction get added in phase. This results in array gain (or 
equivalently, gain of the combined antenna) equal to the sum of individual antenna gains. 


1.3 Power Pattern 

A plot of the array response as a function of angle is referred to as array pattern or antenna 
pattern. It is also called power pattern when the power response is plotted. It shows the 
power received by the array at its output from a particular direction due to a unit power 
source in that direction. A power pattern of an equispaced linear array of ten elements 
with half-wavelength spacing is shown in Figure 1.1. The angle is measured with respect 
to the line of the array. The beam-pointing direction makes a 90° angle with the line of 
the array. The power pattern has been normalized by dividing by the number of elements in
the array so that the maximum array gain in the beam-pointing direction is unity.
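
A pattern of this kind can be computed numerically. The sketch below assumes omnidirectional elements combined with equal weights and no phase change, measures the angle from the line of the array as in the text, and divides the summed response by the number of elements before squaring so that the peak is unity. The function name and its parameters are illustrative, not taken from the book.

```python
import cmath
import math


def power_pattern(theta_deg, num_elements=10, spacing_wavelengths=0.5):
    """Normalized power response of an equispaced linear array.

    theta_deg is measured from the line of the array, so 90 degrees
    is broadside. Equal weights with no phase shift are assumed.
    """
    theta = math.radians(theta_deg)
    # Phase difference between adjacent elements for this direction.
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.cos(theta)
    response = sum(cmath.exp(1j * phase_step * l) for l in range(num_elements))
    return abs(response / num_elements) ** 2


# Unity response in the beam-pointing (broadside) direction.
print(power_pattern(90.0))
# For ten elements at half-wavelength spacing, the first null falls
# where cos(theta) = 0.2, about 78.5 degrees from the array line.
first_null_deg = math.degrees(math.acos(0.2))
print(first_null_deg, power_pattern(first_null_deg))
```

Evaluating `power_pattern` over 0° to 180° and plotting the result in decibels gives a curve of the type described for Figure 1.1.
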

The power pattern drops to a low value on either side of the beam-pointing direction. 
The place of the low value is normally referred to as a null. Strictly speaking, a null is a 
position where the array response is zero. However, the term sometimes is misused to 
indicate the low value of the pattern. The pattern between the two nulls on either side of 
the beam-pointing direction is known as the main lobe (also called main beam or simply 
beam). The width of the main beam between the two half-power points is called the half-power
beamwidth. A smaller beamwidth results from an array with a larger extent. The
extent of the array is known as the aperture of the array. Thus, the array aperture is the 
distance between the two farthest elements in the array. For a linear array, the aperture is 
equal to the distance between the elements on either side of the array. 


FIGURE 1.1

Power pattern of a ten-element linear array with half-wavelength spacing.


1.4 Beam Steering

For a given array the beam may be pointed in different directions by mechanically moving
the array. This is known as mechanical steering. Beam steering can also be accomplished
by appropriately delaying the signals before combining. The process is known as electronic
steering, and no mechanical movement occurs. For narrowband signals, phase shifters
are used to change the phase of signals before combining.

The required delay may also be accomplished by inserting varying lengths of coaxial 
cables between the antenna elements and the combiner. Changing the combinations of 
various lengths of these cables leads to different pointing directions. Switching between 
different combinations of beam-steering networks to point beams in different directions 
is sometimes referred to as beam switching. 

When processing is carried out digitally, the signals from various elements can be 
sampled, stored, and summed after appropriate delays to form beams. The required delay 
is provided by selecting samples from different elements such that the selected samples 
are taken at different times. Each sample is delayed by an integer multiple of the sampling 
interval; thus, a beam can only be pointed in selected directions when using this technique. 
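
The restriction to selected directions can be made concrete. The sketch below assumes an equispaced linear array with the angle measured from the line of the array, so that the delay between adjacent elements is (spacing / speed) × cos(angle). Requiring that delay to be an integer multiple of the sampling interval gives a finite set of pointing directions. The function name and the numerical values are illustrative assumptions.

```python
import math


def steerable_directions(spacing_m, sample_interval_s, speed_m_s):
    """Pointing angles (degrees from the array line) reachable when
    adjacent elements are delayed by whole numbers of samples."""
    # cos(angle) must equal k * ratio for some integer k.
    ratio = speed_m_s * sample_interval_s / spacing_m
    max_k = int(math.floor(1.0 / ratio + 1e-9))
    directions = []
    for k in range(-max_k, max_k + 1):
        # Clamp to guard against rounding just outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, k * ratio))
        directions.append(math.degrees(math.acos(cos_theta)))
    return sorted(directions)


# Hypothetical sonar array: 0.75 m spacing, sound speed 1500 m/s,
# sampling interval 0.125 ms.
for angle in steerable_directions(0.75, 0.000125, 1500.0):
    print(round(angle, 2))
```

With these example values only nine beam directions are available. A shorter sampling interval makes the set denser.
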


1.5 Degree of Freedom 

The gain and phase applied to signals derived from each element may be thought of as 
a single complex quantity, hereafter referred to as the weighting applied to the signals. If 
there is only one element, no amount of weighting can change the pattern of that antenna. 
However, with two elements, when changing the weighting of one element relative to the
other, the pattern may be adjusted to the desired value at one place, that is, you can place
one minimum or maximum anywhere in the pattern. Similarly, with three elements, two
positions may be specified, and so on. Thus, with an L-element array, you can specify L - 1
positions. These may be one maximum in the direction of the desired signal and L - 2
minima (nulls) in the directions of unwanted interferences. This flexibility of an L-element
array to be able to fix the pattern at L - 1 places is known as the degree of freedom of the
array. For an equally spaced linear array, this is similar to a polynomial of degree L - 1 with
L - 1 adjustable coefficients and the first coefficient having the value of unity.
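
The polynomial analogy can be illustrated with a short sketch. It assumes an equispaced linear array with half-wavelength spacing, the angle measured from the line of the array, and a response written as a polynomial in z = exp(j2π(d/λ)cos θ) whose coefficients are the factors applied to the element signals. The highest-power coefficient is fixed at unity, and each remaining coefficient follows from one chosen null direction. All names are illustrative.

```python
import cmath
import math


def z_of(theta_deg, spacing_wavelengths=0.5):
    """Value of z = exp(j*2*pi*(d/lambda)*cos(theta)) for a direction."""
    psi = 2.0 * math.pi * spacing_wavelengths * math.cos(math.radians(theta_deg))
    return cmath.exp(1j * psi)


def coefficients_for_nulls(null_angles_deg, spacing_wavelengths=0.5):
    """Coefficients (ascending powers of z) of the monic polynomial
    whose roots sit at the requested null directions."""
    coeffs = [1.0 + 0j]
    for angle in null_angles_deg:
        root = z_of(angle, spacing_wavelengths)
        # Multiply the current polynomial by (z - root).
        updated = [0j] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            updated[i] += -root * c
            updated[i + 1] += c
        coeffs = updated
    return coeffs


def response(coeffs, theta_deg, spacing_wavelengths=0.5):
    """Array response in a direction for the given coefficients."""
    z = z_of(theta_deg, spacing_wavelengths)
    return sum(c * z ** i for i, c in enumerate(coeffs))


# Two nulls use up both degrees of freedom of a three-element array.
coeffs = coefficients_for_nulls([60.0, 120.0])
print(len(coeffs))
print(abs(response(coeffs, 60.0)), abs(response(coeffs, 120.0)))
print(abs(response(coeffs, 90.0)))
```
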


1.6 Optimal Antenna

An antenna is optimal when the weight of each antenna element is adjusted to achieve 
optimal performance of an array system in some sense. For example, assume that a 
communication system is operating in the presence of unwanted interferences. Furthermore,
the desired signal and interferences are operating at the same carrier frequency such
that these interferences cannot be eliminated by filtering. The optimal performance for a 
communication system in such a situation may be to maximize the signal-to-noise ratio 
(SNR) at the output of the system without causing any signal distortion. This would 
require adjusting the antenna pattern to cancel these interferences with the main beam 
pointed in the signal direction. Thus, the communication system is said to be employing 
an optimal antenna when the gain and the phase of the signal induced on each element 
are adjusted to achieve the maximum output SNR (sometimes also referred to as signal 
to interference and noise ratio, SINR). 


1.7 Adaptive Antenna 

The term adaptive antenna is used for a phased array when the weighting on each element 
is applied in a dynamic fashion. The amount of weighting on each channel is not fixed at 
the time of the array design, but rather decided by the system at the time of processing 
the signals to meet required objectives. In other words, the array pattern adapts to the 
situation and the adaptive process is under control of the system. For example, consider 
the situation of a communication system operating in the presence of a directional interference
operating at the carrier frequency used by the desired signal, and the performance
measure is to maximize the output SNR. As discussed previously, the output SNR is 
maximized by canceling the directional interference using optimal antennas. The antenna 
pattern in this case has a main beam pointed in the desired signal direction, and has a null 
in the direction of the interference. Assume that the interference is not stationary but moving 
slowly. If optimal performance is to be maintained, the antenna pattern needs to adjust so 
that the null position remains in the moving interference direction. A system using adaptive 
antennas adjusts the weighting on each channel with an aim to achieve such a pattern. 

For adaptive antennas, the conventional antenna pattern concepts of beam width, side 
lobes, and main beams are not used, as the antenna weights are designed to achieve a set 
performance criterion such as maximization of the output SNR. On the other hand, in 
conventional phased-array design these characteristics are specified at the time of design.


1.8 Smart Antenna 

The term smart antenna incorporates all situations in which a system is using an antenna 
array and the antenna pattern is dynamically adjusted by the system as required. Thus, 
a system employing smart antennas processes signals induced on a sensor array. A block 
diagram of such a system is shown in Figure 1.2. 

FIGURE 1.2

Block diagram of an antenna array system. 

FIGURE 1.3

Block diagram of a communication system using an antenna array. 

The type of sensors used and the additional information supplied to the processor 
depend on the application. For example, a communication system uses antennas as sensors 
and may use some signal characteristics as additional information. The processor uses 
this information to differentiate the desired signal from unwanted interference. 

A block diagram of a narrowband communication system is shown in Figure 1.3 where 
signals induced on an antenna array are multiplied by adjustable complex weights and 
then combined to form the system output. The processor receives array signals, system 
output, and direction of the desired signal as additional information. The processor calculates
the weights to be used for each channel.


1.9 Book Outline 

Chapter 2 is dedicated to various narrowband processors and their performance. Adaptive 
processing of narrowband signals is discussed in Chapter 3. Descriptions and analyses of
broadband-signal processors are presented in Chapter 4. In Chapter 5, situations are
considered in which the desired signals and unwanted interference are not independent. 
Chapter 6 is focused on using the received signals on an array to identify the direction of 
a radiating source. Chapter 7 and Chapter 8 are focused on fading channels. Chapter 7 
describes such channels and analyzes the performance of a single antenna system in a 
fading environment. Chapter 8 considers multiple antenna systems and presents various 
diversity-combining techniques.

Consider the antenna array system consisting of L antenna elements shown in Figure 2.1,
where signals from each element are multiplied by a complex weight and summed to
form the array output. The figure does not show components such as preamplifiers, bandpass
filters, and so on. It follows from the figure that an expression for the array output
is given by

$$ y(t) = \sum_{l=1}^{L} w_l^{*}\, x_l(t) \tag{2.1} $$


where * denotes the complex conjugate. The conjugate of complex weights is used to 
simplify the mathematical notation. 

Denoting the weights of the array system using vector notation as

$$ \mathbf{w} = [w_1, w_2, \ldots, w_L]^T \tag{2.2} $$

and signals induced on all elements as

$$ \mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{2.3} $$

the output of the array system becomes

$$ y(t) = \mathbf{w}^H \mathbf{x}(t) \tag{2.4} $$

where superscript T and H, respectively, denote transposition and the complex conjugate
transposition of a vector or matrix. Throughout the book w and x(t) are referred to as the
weight vector and the signal vector, respectively. Note that to obtain the array output, you
need to multiply the signals induced on all elements with the corresponding weights. In
vector notation, this operation is carried out by taking the inner product of the weight
vector with the signal vector as given by (2.4).

FIGURE 2.1

Antenna array system.
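
The inner product in (2.4) amounts to conjugating each weight, multiplying it by the corresponding element signal, and summing. A minimal sketch follows; the function names and the sample values are illustrative only.

```python
def array_output(weights, signals):
    """Array output y = w^H x for one time instant."""
    return sum(w.conjugate() * x for w, x in zip(weights, signals))


def output_power(weights, signals):
    """Instantaneous output power |y|^2."""
    return abs(array_output(weights, signals)) ** 2


# Two-element example with arbitrary complex weights and signals.
w = [0.5 + 0j, 0.5j]
x = [1 + 1j, 2 + 0j]
print(array_output(w, x))   # 0.5*(1+1j) + (-0.5j)*2 = 0.5 - 0.5j
print(output_power(w, x))   # |0.5 - 0.5j|^2 = 0.5
```
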

The output power of the array at any time t is given by the magnitude square of the
array output, that is,

$$ P(t) = |y(t)|^2 = y(t)\, y^{*}(t) \tag{2.5} $$

Substituting for y(t) from (2.4), the output power becomes

$$ P(t) = \mathbf{w}^H \mathbf{x}(t)\, \mathbf{x}^H(t)\, \mathbf{w} \tag{2.6} $$

If the components of x(t) can be modeled as zero-mean stationary processes, then for a 
given w the mean output power of the array system is obtained by taking conditional 
expectation over x(t): 

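$$ \begin{aligned} P(\mathbf{w}) &= E[\mathbf{w}^H \mathbf{x}(t)\, \mathbf{x}^H(t)\, \mathbf{w}] \\ &= \mathbf{w}^H E[\mathbf{x}(t)\, \mathbf{x}^H(t)]\, \mathbf{w} \\ &= \mathbf{w}^H \mathbf{R}\, \mathbf{w} \end{aligned} \tag{2.7} $$

where E[ ] denotes the expectation operator and R is the array correlation matrix defined by

$$ \mathbf{R} = E[\mathbf{x}(t)\, \mathbf{x}^H(t)] \tag{2.8} $$

Elements of this matrix denote the correlation between various elements. For example, $R_{ij}$
denotes the correlation between the ith and the jth element of the array.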
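
As a numerical check of (2.7) and (2.8), the sketch below replaces the expectation by an average over a finite set of snapshots, which is an assumption made purely for illustration. It then confirms that the quadratic form with the averaged matrix equals the average of the instantaneous output powers. Function names and data are hypothetical.

```python
def sample_correlation(snapshots):
    """Average of x x^H over the snapshots, used here as a stand-in
    for the expectation in the definition of R."""
    count = len(snapshots)
    size = len(snapshots[0])
    R = [[0j] * size for _ in range(size)]
    for x in snapshots:
        for i in range(size):
            for j in range(size):
                R[i][j] += x[i] * x[j].conjugate() / count
    return R


def quadratic_form(w, R):
    """Evaluate w^H R w, which is real for a Hermitian R."""
    size = len(w)
    total = sum(w[i].conjugate() * R[i][j] * w[j]
                for i in range(size) for j in range(size))
    return total.real


def mean_output_power(w, snapshots):
    """Average of |w^H x|^2 over the snapshots."""
    powers = [abs(sum(wi.conjugate() * xi for wi, xi in zip(w, x))) ** 2
              for x in snapshots]
    return sum(powers) / len(powers)


snapshots = [[1 + 0j, 1j], [1j, 1 + 0j], [2 + 0j, -1j]]
w = [0.5 + 0j, 0.5j]
R = sample_correlation(snapshots)
print(quadratic_form(w, R), mean_output_power(w, snapshots))
```

The two printed numbers agree, because averaging and the quadratic form can be exchanged exactly as in the step from the first to the second line of (2.7).
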

Consider that there is a desired signal source in the presence of unwanted interference 
and random noise. The random noise includes both background and electronic noise. Let 
$\mathbf{x}_S(t)$, $\mathbf{x}_I(t)$, and $\mathbf{n}(t)$, respectively, denote the signal vector due to the desired signal source,
unwanted interference, and random noise. The components of signal, interference, and
random noise in the output, $y_S(t)$, $y_I(t)$, and $y_n(t)$, are then obtained by taking the inner
product of the weight vector with $\mathbf{x}_S(t)$, $\mathbf{x}_I(t)$, and $\mathbf{n}(t)$. These are given by

$$ y_S(t) = \mathbf{w}^H \mathbf{x}_S(t) \tag{2.9} $$

$$ y_I(t) = \mathbf{w}^H \mathbf{x}_I(t) \tag{2.10} $$

and

$$ y_n(t) = \mathbf{w}^H \mathbf{n}(t) \tag{2.11} $$

Define the array correlation matrices due to the signal source, unwanted interference, 
and random noise, respectively, as 

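$$ \mathbf{R}_S = E[\mathbf{x}_S(t)\, \mathbf{x}_S^H(t)] \tag{2.12} $$

$$ \mathbf{R}_I = E[\mathbf{x}_I(t)\, \mathbf{x}_I^H(t)] \tag{2.13} $$

and

$$ \mathbf{R}_n = E[\mathbf{n}(t)\, \mathbf{n}^H(t)] \tag{2.14} $$

Note that R is the sum of these three matrices, that is,

$$ \mathbf{R} = \mathbf{R}_S + \mathbf{R}_I + \mathbf{R}_n \tag{2.15} $$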

Let $P_S$, $P_I$, and $P_n$ denote the mean output power due to the signal source, unwanted
interference, and random noise, respectively. Following (2.7), these are given by

$$ P_S = \mathbf{w}^H \mathbf{R}_S\, \mathbf{w} \tag{2.16} $$

$$ P_I = \mathbf{w}^H \mathbf{R}_I\, \mathbf{w} \tag{2.17} $$

and

$$ P_n = \mathbf{w}^H \mathbf{R}_n\, \mathbf{w} \tag{2.18} $$

Let $P_N$ denote the mean power at the output of the array contributed by random noise
and unwanted interference, that is,

$$ P_N = P_I + P_n \tag{2.19} $$

We refer to $P_N$ as the mean noise power at the output of the array system. Note that the
noise here includes random noise and contributions from all sources other than the desired
signal. In some sources, this is also referred to as noise plus interference.

Substituting from (2.17) and (2.18) in (2.19),

$$ \begin{aligned} P_N &= \mathbf{w}^H \mathbf{R}_I\, \mathbf{w} + \mathbf{w}^H \mathbf{R}_n\, \mathbf{w} \\ &= \mathbf{w}^H (\mathbf{R}_I + \mathbf{R}_n)\, \mathbf{w} \end{aligned} \tag{2.20} $$

Let $\mathbf{R}_N$ denote the noise array correlation matrix, that is,

$$ \mathbf{R}_N = \mathbf{R}_I + \mathbf{R}_n \tag{2.21} $$

Then $P_N$, the mean noise power at the output of the system, can be expressed in terms of
the weight vector and $\mathbf{R}_N$ as

$$ P_N = \mathbf{w}^H \mathbf{R}_N\, \mathbf{w} \tag{2.22} $$


Let the output signal-to-noise ratio (SNR), sometimes also referred to as the signal to 
interference plus noise ratio (SINR), be defined as the ratio of the mean output signal 
power to the mean output noise power at the output of the array system, that is, 

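$$ \mathrm{SNR} = \frac{P_S}{P_N} \tag{2.23} $$

Substituting from (2.16) and (2.22) in (2.23), it follows that

$$ \mathrm{SNR} = \frac{\mathbf{w}^H \mathbf{R}_S\, \mathbf{w}}{\mathbf{w}^H \mathbf{R}_N\, \mathbf{w}} \tag{2.24} $$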


The weights of the array system determine system performance. The selection process of 
these weights depends on the application and leads to various types of beamforming schemes. 
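
The ratio in (2.24) is straightforward to evaluate once the two correlation matrices are known. The sketch below uses small hand-picked matrices for a two-element array: a unit-power signal that is identical on both elements, and uncorrelated noise of power 0.1 on each element. These values are assumptions chosen only to make the arithmetic easy to follow.

```python
def quadratic_form(w, R):
    """Evaluate w^H R w for a weight list w and a nested-list matrix R."""
    size = len(w)
    total = sum(complex(w[i]).conjugate() * R[i][j] * w[j]
                for i in range(size) for j in range(size))
    return total.real


def output_snr(w, R_signal, R_noise):
    """Output SNR: mean signal power over mean noise power."""
    return quadratic_form(w, R_signal) / quadratic_form(w, R_noise)


# Illustrative matrices (not from the text).
R_S = [[1.0, 1.0], [1.0, 1.0]]
R_N = [[0.1, 0.0], [0.0, 0.1]]

# Equal weights: signal power 1.0, noise power 0.05, SNR about 20.
print(output_snr([0.5, 0.5], R_S, R_N))
# Scaling every weight by the same factor leaves the ratio unchanged.
print(output_snr([1.0, 1.0], R_S, R_N))
```

Because the numerator and denominator are both quadratic in the weights, only the relative values of the weights affect the output SNR.
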

In this chapter, various beamforming schemes are discussed, performance of a processor 
using these schemes is analyzed, and the effect of errors on processor performance is 
presented [God93, God97]. 


2.1 Signal Model 

In this section, a signal model is described and expressions for the signal vector and the 
array correlation matrix required for the understanding of various beamforming schemes 
are written. 

Assume that the array is located in the far field of directional sources. Thus, as far as 
the array is concerned, the directional signal incident on the array can be considered as a 
plane wave front. Also assume that the plane wave propagates in a homogeneous medium
and that the array consists of identical distortion-free omnidirectional elements. Thus, for
the ideal case of nondispersive propagation and distortion-free elements, the effect of
propagation from a source to an element is a pure time delay. 

Let the origin of the coordinate system be taken as the time reference as shown in 
Figure 2.2. Thus, the time taken by a plane wave arriving from the kth source in direction
$(\phi_k, \theta_k)$ and measured from the lth element to the origin is given by

$$ \tau_l(\phi_k, \theta_k) = \frac{\mathbf{r}_l \cdot \mathbf{v}(\phi_k, \theta_k)}{c} \tag{2.1.1} $$


where $\mathbf{r}_l$ is the position vector of the lth element, $\mathbf{v}(\phi_k, \theta_k)$ is the unit vector in direction
$(\phi_k, \theta_k)$, c is the speed of propagation of the plane wave front, and the dot represents the
dot product. For a linear array of equispaced elements with element spacing d, aligned
with the x-axis such that the first element is situated at the origin as shown in Figure 2.3,
it becomes

$$ \tau_l(\theta_k) = \frac{d}{c}\,(l-1)\cos\theta_k \tag{2.1.2} $$

FIGURE 2.2

Coordinate system.

FIGURE 2.3

Linear array with element spacing d.

Note that when the kth source is broadside to the array, $\theta_k = 90^\circ$. It follows from (2.1.2)
that for this case, $\tau_l(\theta_k) = 0$ for all l. Thus, the wave front arrives at all the elements of the
array at the same time and signals induced on all the elements due to this source are
identical. For $\theta_k = 0^\circ$, the wave front arrives at the lth element before it arrives at the
origin, and the signal induced on the lth element leads that induced on an element at
the origin. The time delay given by (2.1.2) is

$$ \tau_l(\theta_k) = \frac{d}{c}\,(l-1) \tag{2.1.3} $$

On the other hand, for $\theta_k = 180^\circ$, the time delay is given by

$$ \tau_l(\theta_k) = -\frac{d}{c}\,(l-1) \tag{2.1.4} $$

The negative sign is due to the definition of $\tau_l$. It is the time taken by the plane wave from
the lth element to the origin. The negative sign indicates that the wave front arrives at the
origin before it arrives at the lth element, and the signal induced on the lth element lags
behind that induced on an element at the origin.
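
The three cases above can be reproduced with a short sketch of (2.1.2). The function name, the default propagation speed (that of light), and the 0.15 m spacing are illustrative assumptions. The list index l = 0, ..., L - 1 plays the role of (l - 1) in the equation.

```python
import math


def element_delays(num_elements, spacing_m, theta_deg, speed_m_s=3.0e8):
    """Delays (seconds) from each element to the origin for a linear
    array on the x-axis with its first element at the origin."""
    cos_theta = math.cos(math.radians(theta_deg))
    return [(spacing_m / speed_m_s) * l * cos_theta
            for l in range(num_elements)]


# Broadside: all delays are zero apart from rounding.
print(element_delays(4, 0.15, 90.0))
# 0 degrees: positive delays, the lth element leads the origin.
print(element_delays(4, 0.15, 0.0))
# 180 degrees: negative delays, the lth element lags the origin.
print(element_delays(4, 0.15, 180.0))
```
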

The signal induced on the reference element (an element at the origin) due to the kth 
source is normally expressed in complex notation as 

$$ m_k(t)\, e^{j 2\pi f_0 t} \tag{2.1.5} $$

with $m_k(t)$ denoting the complex modulating function and $f_0$ denoting the carrier frequency.
The structure of the modulating function reflects the particular modulation used
in a communication system. For example, for frequency division multiple access (FDMA)
systems it is a frequency-modulated signal given by $m_k(t) = A_k e^{j\xi_k(t)}$ with $A_k$ denoting the
amplitude and $\xi_k(t)$ denoting the message. For time division multiple access (TDMA)
systems, it is given by

$$ m_k(t) = \sum_{n} d_k(n)\, p(t - n\Delta) \tag{2.1.6} $$

where p(t) is the sampling pulse, the amplitude $d_k(n)$ denotes the message symbol, and $\Delta$
is the sampling interval. For code division multiple access (CDMA) systems, $m_k(t)$ is given by

$$ m_k(t) = d_k(n)\, g(t) \tag{2.1.7} $$

where $d_k(n)$ denotes the message sequence and g(t) is a pseudo random-noise binary
sequence having the values +1 or -1.

In general, the modulating function is normally modeled as a complex low-pass process
with zero mean and variance equal to the source power $p_k$ as measured at the reference
element. Assuming that the wave front on the lth element arrives $\tau_l(\phi_k, \theta_k)$ seconds before
it arrives at the reference element, the signal induced on the lth element due to the kth
source can be expressed as

$$ m_k(t)\, e^{j 2\pi f_0 (t + \tau_l(\phi_k, \theta_k))} \tag{2.1.8} $$

The expression is based upon the narrowband assumption for array signal processing,
which assumes that the bandwidth of the signal is narrow enough and that the array
dimensions are small enough for the modulating function to stay almost constant during
$\tau_l(\phi_k, \theta_k)$ seconds, that is, the approximation $m_k(t) \cong m_k(t + \tau_l(\phi_k, \theta_k))$ holds.
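
One informal way to judge whether this assumption is reasonable is to compare the largest propagation delay across the array with the time scale over which the modulating function changes, which is roughly the inverse of the signal bandwidth. The product of the two should be much smaller than one. This rule of thumb and the numbers below are assumptions for illustration and are not taken from the text.

```python
def delay_bandwidth_product(aperture_m, bandwidth_hz, speed_m_s=3.0e8):
    """Largest delay across the array multiplied by the bandwidth.

    Values far below 1 suggest the modulating function barely changes
    while the wave front crosses the array.
    """
    max_delay_s = aperture_m / speed_m_s
    return max_delay_s * bandwidth_hz


# Hypothetical case: a 1.5 m aperture (ten elements at half-wavelength
# spacing for a 900 MHz carrier) and a 200 kHz signal bandwidth.
print(delay_bandwidth_product(1.5, 200.0e3))
```

For these example values the product is about 0.001, so the modulating function can be treated as constant across the array.
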

Assume that there are M directional sources present. Let $x_l(t)$ denote the total signal
induced due to all M directional sources and background noise on the lth element. Thus,

$$ x_l(t) = \sum_{k=1}^{M} m_k(t)\, e^{j 2\pi f_0 (t + \tau_l(\phi_k, \theta_k))} + n_l(t) \tag{2.1.9} $$

where $n_l(t)$ is the random noise component on the lth element, which includes background
noise and electronic noise generated in the lth channel. It is assumed to be temporally
white with zero mean and variance equal to $\sigma_n^2$. Furthermore, it is assumed to be uncorrelated
with directional sources, that is,

$$ E[m_k(t)\, n_l(t)] = 0 \tag{2.1.10} $$

The noise on different elements is also assumed to be uncorrelated, that is,

$$ E[n_k(t)\, n_l(t)] = \begin{cases} 0 & l \neq k \\ \sigma_n^2 & l = k \end{cases} \tag{2.1.11} $$

It should be noted that if the elements were not omnidirectional, then the signal induced 
on each element due to a source is scaled by an amount equal to the response of the 
element under consideration in the direction of the source. 

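Substituting from (2.1.9) in (2.3), the signal vector becomes

$$ x(t) = \sum_{k=1}^{M} m_k(t) \begin{bmatrix} e^{j 2\pi f_0 \tau_1(\phi_k,\theta_k)} \\ e^{j 2\pi f_0 \tau_2(\phi_k,\theta_k)} \\ \vdots \\ e^{j 2\pi f_0 \tau_L(\phi_k,\theta_k)} \end{bmatrix} + n(t) \tag{2.1.12} $$

where the carrier term $e^{j 2\pi f_0 t}$ has been dropped for ease of notation as it plays
no role in subsequent treatment and

$$ n(t) = \begin{bmatrix} n_1(t) \\ n_2(t) \\ \vdots \\ n_L(t) \end{bmatrix} \tag{2.1.13} $$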


2.1.1 Steering Vector Representation 

The steering vector is an L-dimensional complex vector containing responses of all L
elements of the array to a narrowband source of unit power. Let $S_k$ denote the steering
vector associated with the kth source. For an array of identical elements, it is defined as

$$ S_k = \left[ \exp\!\left(j 2\pi f_0 \tau_1(\phi_k,\theta_k)\right), \ldots, \exp\!\left(j 2\pi f_0 \tau_L(\phi_k,\theta_k)\right) \right]^T \tag{2.1.14} $$

Note that when the first element of the array is at the origin of the coordinate system,
$\tau_1(\phi_k,\theta_k) = 0$ and the first element of the steering vector is identical to unity.

As the response of the array varies according to direction, a steering vector is associated 
with each directional source. Uniqueness of this association depends on array geometry 
[God81]. For a linear array of equally spaced elements with element spacing less than
half a wavelength, the steering vector for every direction is unique.

For an array of identical elements, each component of this vector has unit magnitude. 
The phase of its ith component is equal to the phase difference between signals induced 
on the ith element and the reference element due to the source associated with the steering 
vector. As each component of this vector denotes the phase delay caused by the spatial 
position of the corresponding element of the array, this vector is also known as the space 
vector. It is also referred to as the array response vector as it measures the response of the 
array due to the source under consideration. In multipath situations such as in mobile 
communications, it also denotes the response of the array to all signals arising from the 
source [Nag94]. In this book, steering vector, space vector, and array response vector are
used interchangeably. 
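
The definition (2.1.14) can be illustrated numerically. The sketch below is only an example under stated assumptions: a uniform linear array with its first element at the origin and the direction measured from the array axis, so that $2\pi f_0 \tau_l = 2\pi l (d/\lambda)\cos\theta$ for element index $l = 0, 1, \ldots$. The function name `steering_vector` and the parameter values are hypothetical and do not come from the text.

```python
import cmath
import math

def steering_vector(num_elements, spacing_wavelengths, theta_deg):
    """Steering vector of a uniform linear array (assumed geometry).

    Element l (l = 0, 1, ...) sits at l * d on the array axis, so the
    phase 2*pi*f0*tau_l equals 2*pi*l*(d/lambda)*cos(theta).
    """
    theta = math.radians(theta_deg)
    phase_step = 2.0 * math.pi * spacing_wavelengths * math.cos(theta)
    return [cmath.exp(1j * phase_step * l) for l in range(num_elements)]

# Hypothetical example: four elements, half-wavelength spacing, source at 60 degrees.
S = steering_vector(num_elements=4, spacing_wavelengths=0.5, theta_deg=60.0)

# Every component has unit magnitude for identical elements.
magnitudes = [abs(s) for s in S]

# The squared norm S^H S equals the number of elements L.
norm_sq = sum((s.conjugate() * s).real for s in S)
```

With the reference element at the origin, the first component is unity and the remaining components carry only the phase differences relative to that element.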

Using (2.1.14) in (2.1.12), the signal vector can be compactly expressed as

$$ x(t) = \sum_{k=1}^{M} m_k(t)\, S_k + n(t) \tag{2.1.15} $$

Substituting for $x(t)$ from (2.1.15) in (2.4), it follows that

$$ y(t) = w^H x(t) = \sum_{k=1}^{M} m_k(t)\, w^H S_k + w^H n(t) \tag{2.1.16} $$


The first term on the right side of (2.1.16) is the contribution from all directional sources 
and the second term is the random noise contribution to the array output. Note that the 
contribution of all directional sources contained in the first term is the weighted sum of 
modulating functions of all sources. The weight applied to each source is the inner product 
of the processor weight vector and steering vector associated with that source, and denotes 
the complex response of the processor toward the source. Thus, the response of a processor 
with weight vector $w$ toward a source in direction $(\phi,\theta)$ is given by

$$ y(\phi,\theta) = w^H S(\phi,\theta) \tag{2.1.17} $$

An expression for the array correlation matrix is derived in terms of steering vectors. 
Substituting the signal vector x(t) from (2.1.15) in the definition of the array correlation 
matrix given by (2.8) leads to the following expression for the array correlation matrix: 


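$$ \begin{aligned}
R &= E\left[ \left( \sum_{k=1}^{M} m_k(t) S_k + n(t) \right) \left( \sum_{k=1}^{M} m_k(t) S_k + n(t) \right)^H \right] \\
  &= E\left[ \left( \sum_{k=1}^{M} m_k(t) S_k \right) \left( \sum_{k=1}^{M} m_k(t) S_k \right)^H \right] + E\left[ n(t)\, n^H(t) \right] \\
  &\quad + E\left[ \left( \sum_{k=1}^{M} m_k(t) S_k \right) n^H(t) \right] + E\left[ n(t) \left( \sum_{k=1}^{M} m_k(t) S_k \right)^H \right]
\end{aligned} \tag{2.1.18} $$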


The first term on the right-hand side (RHS) of (2.1.18) simplifies to

$$ E\left[ \left( \sum_{k=1}^{M} m_k(t) S_k \right) \left( \sum_{k=1}^{M} m_k(t) S_k \right)^H \right] = \sum_{k,l=1}^{M} E\left[ m_k(t)\, m_l^*(t) \right] S_k S_l^H \tag{2.1.19} $$

When sources are uncorrelated,

$$ E\left[ m_l(t)\, m_k^*(t) \right] = \begin{cases} 0 & l \neq k \\ p_k & l = k \end{cases} \tag{2.1.20} $$

where $p_k$ denotes the power of the kth source measured at one of the elements of the
array. It should be noted that $p_k$ is the variance of the complex modulating function
$m_k(t)$ when it is modeled as a zero-mean low-pass random process, as mentioned previously.
Thus, for uncorrelated sources the first term becomes

$$ E\left[ \left( \sum_{k=1}^{M} m_k(t) S_k \right) \left( \sum_{k=1}^{M} m_k(t) S_k \right)^H \right] = \sum_{k=1}^{M} p_k S_k S_k^H \tag{2.1.21} $$


Because the directional sources and the white noise are uncorrelated, the third and
fourth terms on the RHS of (2.1.18) are identically zero. Using (2.1.11), the second term
simplifies to $\sigma_n^2 I$ with $I$ denoting an identity matrix. This along with (2.1.21)
leads to the following expression for the array correlation matrix when directional sources
are uncorrelated:

$$ R = \sum_{k=1}^{M} p_k S_k S_k^H + \sigma_n^2 I \tag{2.1.22} $$

where $I$ is the identity matrix and $\sigma_n^2 I$ denotes the component of the array
correlation matrix due to random noise, that is,

$$ R_n = \sigma_n^2 I \tag{2.1.23} $$

Let $S_0$ denote the steering vector associated with the signal source of power $p_s$. Then
the array correlation matrix due to the signal source is given by

$$ R_s = p_s S_0 S_0^H \tag{2.1.24} $$

Similarly, the array correlation matrix due to an interference of power $p_I$ is given by

$$ R_I = p_I S_I S_I^H \tag{2.1.25} $$

where $S_I$ denotes the steering vector associated with the interference.

Using matrix notation, the correlation matrix R may be expressed in the following
compact form:

$$ R = A S A^H + \sigma_n^2 I \tag{2.1.26} $$

where columns of the L x M matrix A are made up of steering vectors, that is,

$$ A = [S_1, S_2, \ldots, S_M] \tag{2.1.27} $$

and the M x M matrix S denotes the source correlation. For uncorrelated sources, it is a
diagonal matrix with

$$ S_{ij} = \begin{cases} p_i & i = j \\ 0 & i \neq j \end{cases} \tag{2.1.28} $$
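
As a numerical check of the two forms of the correlation matrix, the sketch below builds R once as the sum in (2.1.22) and once as the matrix product in (2.1.26) and compares them. It assumes a uniform linear array with half-wavelength spacing and two uncorrelated sources; the helper names (`steering_vector`, `matmul`, `hermitian`) and all numerical values are hypothetical.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def matmul(X, Y):
    # Plain matrix product of two nested-list matrices.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def hermitian(X):
    # Conjugate transpose of a nested-list matrix.
    return [[X[i][j].conjugate() for i in range(len(X))]
            for j in range(len(X[0]))]

L = 4
powers = [1.0, 0.5]        # p_1, p_2 (hypothetical values)
angles = [90.0, 40.0]      # source directions in degrees (hypothetical)
noise_var = 0.1            # sigma_n^2
vectors = [steering_vector(L, 0.5, a) for a in angles]

# (2.1.22): R = sum_k p_k S_k S_k^H + sigma_n^2 I
R_sum = [[sum(p * S[i] * S[j].conjugate() for p, S in zip(powers, vectors))
          + (noise_var if i == j else 0.0)
          for j in range(L)] for i in range(L)]

# (2.1.26): R = A S A^H + sigma_n^2 I with A = [S_1, S_2] and S = diag(p_k)
A = [[vectors[k][i] for k in range(len(vectors))] for i in range(L)]
S_src = [[powers[i] if i == j else 0.0 for j in range(2)] for i in range(2)]
R_mat = matmul(matmul(A, S_src), hermitian(A))
for i in range(L):
    R_mat[i][i] += noise_var

max_diff = max(abs(R_sum[i][j] - R_mat[i][j])
               for i in range(L) for j in range(L))
```

Each diagonal entry of R equals the total power on one element, here $p_1 + p_2 + \sigma_n^2$, and the matrix is Hermitian by construction.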


2.1.2 Eigenvalue Decomposition 

Sometimes it is useful to express the array correlation matrix in terms of its eigenvalues 
and their associated eigenvectors. The eigenvalues of the array correlation matrix can be 
divided into two sets when the environment consists of uncorrelated directional sources 
and uncorrelated white noise. 

The eigenvalues contained in one set are of equal value. Their value does not depend
on directional sources and is equal to the variance of white noise. The eigenvalues
contained in the second set are functions of directional source parameters and their number
is equal to the number of these sources. Each eigenvalue of this set is associated with a
directional source and its value changes with the change in the source power of this source.
The eigenvalues of this set are bigger than those associated with the white noise.
Sometimes these eigenvalues are referred to as the signal eigenvalues, and the others
belonging to the first set are referred to as the noise eigenvalues. Thus, a correlation
matrix of an array of L elements immersed in M uncorrelated directional sources and white
noise has M signal eigenvalues and L - M noise eigenvalues.

Denoting the L eigenvalues of the array correlation matrix in descending order by
$\lambda_l$, $l = 1, \ldots, L$ and their corresponding unit-norm eigenvectors by $U_l$,
$l = 1, \ldots, L$, the matrix takes the following form:

$$ R = Q \Lambda Q^H \tag{2.1.29} $$

with a diagonal matrix

$$ \Lambda = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_L \end{bmatrix} \tag{2.1.30} $$

and

$$ Q = [U_1 \; \ldots \; U_L] \tag{2.1.31} $$

This representation is sometimes referred to as the spectral decomposition of the array
correlation matrix. Using the fact that the eigenvectors form an orthonormal set,

$$ Q Q^H = I \tag{2.1.32} $$

and

$$ Q^H Q = I \tag{2.1.33} $$

Thus,

$$ Q^H = Q^{-1} \tag{2.1.34} $$


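The orthonormal property of the eigenvectors, together with the fact that the L - M noise
eigenvalues all equal $\sigma_n^2$, leads to the following expression for the array
correlation matrix:

$$ R = \sum_{l=1}^{M} (\lambda_l - \sigma_n^2)\, U_l U_l^H + \sigma_n^2 I \tag{2.1.35} $$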
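
The split into signal and noise eigenvalues can be checked by hand in the simplest case. The sketch below assumes a single directional source (M = 1) on a uniform linear array, for which the eigenpairs follow directly from $R = p\,S S^H + \sigma_n^2 I$: the signal eigenvector is $S/\sqrt{L}$ with eigenvalue $pL + \sigma_n^2$, and any vector orthogonal to $S$ has eigenvalue $\sigma_n^2$. This closed form is a standard result rather than something stated in the text, and the geometry, helper name `mat_vec`, and values are hypothetical.

```python
import cmath
import math

L = 4
p = 2.0                  # source power (hypothetical)
noise_var = 0.1          # sigma_n^2 (hypothetical)
step = 2.0 * math.pi * 0.5 * math.cos(math.radians(70.0))
S = [cmath.exp(1j * step * l) for l in range(L)]   # assumed ULA steering vector

# R = p S S^H + sigma_n^2 I for a single source
R = [[p * S[i] * S[j].conjugate() + (noise_var if i == j else 0.0)
      for j in range(L)] for i in range(L)]

def mat_vec(M, v):
    # Matrix-vector product for nested-list matrices.
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Signal eigenpair: U_1 = S / sqrt(L), lambda_1 = p L + sigma_n^2.
U1 = [s / math.sqrt(L) for s in S]
lambda_1 = p * L + noise_var
signal_residual = max(abs(a - lambda_1 * b) for a, b in zip(mat_vec(R, U1), U1))

# A vector orthogonal to S (here: e_0 with its component along S removed)
# is a noise eigenvector with eigenvalue sigma_n^2.
e0 = [1.0] + [0.0] * (L - 1)
proj = sum(s.conjugate() * x for s, x in zip(S, e0)) / L
v = [x - proj * s for x, s in zip(e0, S)]
noise_residual = max(abs(a - noise_var * b) for a, b in zip(mat_vec(R, v), v))
```

Both residuals vanish up to rounding, so the single signal eigenvalue exceeds the noise eigenvalues by $pL$, consistent with the statement that signal eigenvalues are larger and grow with source power.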


2.2 Conventional Beamformer 

The conventional beamformer, sometimes also known as the delay-and-sum beamformer,
has weights of equal magnitudes. The phases are selected to steer the array in a particular
direction $(\phi_0,\theta_0)$, known as the look direction. With $S_0$ denoting the steering
vector in the look direction, the array weights are given by

$$ w_c = \frac{1}{L} S_0 \tag{2.2.1} $$


The response of a processor in a direction $(\phi,\theta)$ is obtained by using (2.1.17),
that is, taking the dot product of the weight vector with the steering vector
$S(\phi,\theta)$. With the weights given by (2.2.1), the response $y(\phi,\theta)$ is given by

$$ y(\phi,\theta) = w_c^H S(\phi,\theta) = \frac{1}{L} S_0^H S(\phi,\theta) \tag{2.2.2} $$
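
The response (2.2.2) is easy to evaluate for a concrete geometry. The sketch below assumes a uniform linear array of eight elements with half-wavelength spacing and a broadside look direction; these choices, the helper names, and the scan grid are hypothetical. For this particular geometry the pattern has a null where $\cos\theta = 1/(Ld)$ with $d$ in wavelengths, which the code uses as a sanity check.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def response(weights, S):
    # y = w^H S, as in (2.1.17)
    return sum(w.conjugate() * s for w, s in zip(weights, S))

L, d = 8, 0.5
look_deg = 90.0                       # broadside look direction (assumed)
S0 = steering_vector(L, d, look_deg)
w_c = [s / L for s in S0]             # conventional weights (2.2.1)

# Magnitude of the response in the look direction.
look_gain = abs(response(w_c, S0))

# First null of this broadside pattern: cos(theta) = 1 / (L d).
null_deg = math.degrees(math.acos(1.0 / (L * d)))
null_gain = abs(response(w_c, steering_vector(L, d, null_deg)))

# Coarse scan of the pattern magnitude from 0 to 180 degrees.
pattern = [abs(response(w_c, steering_vector(L, d, t))) for t in range(0, 181, 5)]
```

The look-direction gain is unity and no direction exceeds it, which is the behavior examined in the subsections that follow.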


Next, the behavior of this processor is examined under various conditions. It is shown 
that the array with these weights has unity power response in the look direction, that is, 
the mean output power of the processor due to a source in the look direction is the same 
as the source power. An expression for the output SNR is also derived.


2.2.1 Source in Look Direction

Assume a source of power $p_s$ in the look direction, hereafter referred to as the signal
source, with $m_s(t)$ denoting its modulating function. The signal induced on the $l$th
element due to this source only is given by

$$ x_l(t) = m_s(t)\, e^{j 2\pi f_0 (t + \tau_l(\phi_0,\theta_0))} \tag{2.2.3} $$

Thus, in vector notation, using the steering vector to denote relevant phases, the array
signal vector due to the look direction signal becomes

$$ x(t) = m_s(t)\, e^{j 2\pi f_0 t} S_0 \tag{2.2.4} $$

The output of the processor is obtained by taking the inner product of weight vector $w_c$
with the signal vector $x(t)$ as in (2.4). Thus, the output of the processor is given by

$$ y(t) = w_c^H x(t) \tag{2.2.5} $$

Substituting from (2.2.1) and (2.2.4), and noting that $S_0^H S_0 = L$, the output becomes

$$ y(t) = m_s(t)\, e^{j 2\pi f_0 t} \tag{2.2.6} $$

Thus, the output of the conventional processor is the same as the signal induced on an
element positioned at the reference element. Next, look at its mean output power. As there
is only the signal source present, the mean output power of the processor is the mean
signal power given by (2.16), that is,

$$ P(w_c) = P_s = w_c^H R_s w_c \tag{2.2.7} $$

Since

$$ R_s = p_s S_0 S_0^H \tag{2.2.8} $$

substituting from (2.2.1), (2.2.8) in (2.2.7), and noting that $S_0^H S_0 = L$,

$$ P(w_c) = p_s \tag{2.2.9} $$

Thus, the mean output power of the conventional processor steered in the look direction
is equal to the power of the source in the look direction. The process is similar to
mechanically steering the array in the look direction except that it is done electronically
by adjusting the phases. This is also referred to as electronic steering, and phase shifters
are used to adjust the required phases. It should be noted that the aperture of an
electronically steered array is different from that of the mechanically steered array.

The concept of delay-and-sum beamformer can be further understood with the help of
Figure 2.4, which shows an array with two elements separated by distance d. Assume that
a plane wave arriving from direction $\theta$ induces voltage $s(t)$ on the first element. As
the wave arrives at the second element $T$ seconds later, with

$$ T = \frac{d}{c} \cos\theta \tag{2.2.10} $$

the induced voltage on the second element equals $s(t - T)$. If the signal induced at
Element 1 is delayed by time $T$ and no delay is provided at Element 2, then both voltage
wave forms are the same and equal to $s(t - T)$. The output of the processor is the sum of
these two signals. A scaling of each wave form by 0.5 provides the gain in direction
$\theta$ equal to unity.

FIGURE 2.4

Delay-and-sum beamformer.
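
The two-element argument can be replayed in a few lines. The sketch below is an illustration only: the propagation speed, element separation, arrival direction, and the sinusoidal waveform `s` are hypothetical choices, and the delays are applied to an ideal continuous-time function rather than to sampled data.

```python
import math

c = 1500.0                    # propagation speed in m/s (assumed: sound in water)
d = 0.5                       # element separation in m (hypothetical)
theta = math.radians(60.0)    # arrival direction (hypothetical)
T = d / c * math.cos(theta)   # inter-element delay, (2.2.10)

def s(t):
    # Hypothetical waveform induced on Element 1.
    return math.sin(2.0 * math.pi * 1000.0 * t)

def element1(t):
    return s(t)

def element2(t):
    # The wave reaches Element 2 T seconds later.
    return s(t - T)

def beamformer_output(t):
    delayed_1 = element1(t - T)   # Element 1 delayed by T
    undelayed_2 = element2(t)     # no delay at Element 2
    # Each branch is scaled by 0.5 before summing.
    return 0.5 * delayed_1 + 0.5 * undelayed_2

times = [n * 1e-4 for n in range(20)]
max_error = max(abs(beamformer_output(t) - s(t - T)) for t in times)
```

After alignment both branches carry the same waveform, so the scaled sum reproduces $s(t - T)$ exactly and the gain toward the arrival direction is unity.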


2.2.2 Directional Interference 

Let only a directional interference of power $p_I$ be present in direction
$(\phi_I,\theta_I)$. Let $m_I(t)$ and $S_I$, respectively, denote the modulating function and
the steering vector for the interference. The array signal vector for this case becomes

$$ x(t) = m_I(t)\, e^{j 2\pi f_0 t} S_I \tag{2.2.11} $$

The array output is obtained by taking the inner product of weight vector and the array
signal vector. Thus,

$$ y(t) = w_c^H x(t) = m_I(t)\, e^{j 2\pi f_0 t}\, w_c^H S_I = m_I(t)\, e^{j 2\pi f_0 t}\, \frac{S_0^H S_I}{L} \tag{2.2.12} $$

The quantity $S_0^H S_I / L$ determines the amount of interference allowed to filter through
the processor and thus is the response of the processor in the interference direction.

The amount of interference power at the output of a processor is given by (2.17). Thus,
in the presence of interference only, an expression for the mean output power of the
conventional processor becomes

$$ P(w_c) = P_I = w_c^H R_I w_c \tag{2.2.13} $$

For a single source in the nonlook direction

$$ R_I = p_I S_I S_I^H \tag{2.2.14} $$

Substituting for $R_I$ and $w_c$ in (2.2.13),

$$ P(w_c) = p_I (1 - \rho) \tag{2.2.15} $$

where

$$ \rho = 1 - \frac{S_0^H S_I S_I^H S_0}{L^2} \tag{2.2.16} $$

and depends on the array geometry and the direction of the interference relative to the
look direction.
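
To make the dependence of $\rho$ on the interference direction concrete, the sketch below evaluates (2.2.16) for a ten-element uniform linear array with the signal broadside. The half-wavelength spacing, the convention of measuring the angle from the array axis, and the helper names are assumptions made for the example and are not taken from the figures.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def rho(S0, SI):
    # (2.2.16): rho = 1 - S0^H SI SI^H S0 / L^2 = 1 - |S0^H SI|^2 / L^2
    L = len(S0)
    inner = sum(a.conjugate() * b for a, b in zip(S0, SI))
    return 1.0 - abs(inner) ** 2 / L ** 2

L, d = 10, 0.5
S0 = steering_vector(L, d, 90.0)      # signal broadside to the array

# Interference arriving from the look direction itself gives rho = 0.
rho_same = rho(S0, S0)

# rho on a coarse grid of interference directions (degrees).
rho_by_angle = {t: rho(S0, steering_vector(L, d, t)) for t in range(0, 181, 10)}
```

Values of $\rho$ close to 1 mean that the conventional processor passes little of the interference, whereas $\rho$ near 0 means the interference sits inside the main lobe.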

The effect of the interference direction on parameter $\rho$ is shown in Figure 2.5 and
Figure 2.6 for two types of arrays, planar and linear. The planar array consists of two
rings of five elements each, as shown in Figure 2.7, whereas the linear array consists of
ten equispaced elements.



FIGURE 2.5

Parameter $\rho$ vs. interference direction at three values of inter-ring spacing for the
array geometry shown in Figure 2.7. (From Godara, L.C., J. Acoust. Soc. Am., 85, 202-213,
1989 [God89a]. With permission.)



FIGURE 2.6

Parameter $\rho$ vs. interference direction for a ten-element linear array. (From Godara,
L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.)



FIGURE 2.7 

Structure of planar array. 


For the planar array, the signal and the interference directions are assumed to be in the
plane of the array; the signal direction coincides with the x-axis. For the linear array, the
signal is assumed to be broadside to the array. For both cases, the direction of the
interference is measured relative to the x-axis.

Figure 2.5 and Figure 2.6, respectively, show the values of $\rho$ for various interference
directions at three values of inter-ring spacing g and three values of inter-element spacing
d. The parameters g and d are expressed in terms of the wavelength of the narrowband
sources. These figures show how $\rho$ depends on the array geometry for given interference
and signal directions.


2.2.3 Random Noise Environment 

Consider an environment consisting of uncorrelated noise of power $\sigma_n^2$. It is
assumed that there is no directional source present. The array signal vector for this case
becomes

$$ x(t) = n(t) \tag{2.2.17} $$

The array output is obtained by taking the inner product of weight vector and the array
signal vector. Thus,

$$ y(t) = w_c^H x(t) \tag{2.2.18} $$

Substituting from (2.2.17) and (2.2.1), the output becomes

$$ y(t) = \frac{S_0^H n(t)}{L} \tag{2.2.19} $$

The mean output noise power of a processor is given by (2.18). Thus, the mean output
power of the conventional processor in the presence of uncorrelated noise only is given by

$$ P(w_c) = P_n = w_c^H R_n w_c \tag{2.2.20} $$

Since $R_n$ is given by

$$ R_n = \sigma_n^2 I \tag{2.2.21} $$

substituting for $R_n$ and $w_c$ in (2.2.20),

$$ P(w_c) = \frac{\sigma_n^2}{L} \tag{2.2.22} $$

Thus, the mean power at the output of the conventional processor is equal to the mean 
uncorrelated noise power at an element of the array divided by the number of elements 
in the array. In other words, the noise power at the array output is L times less than that 
present on each element. 


2.2.4 Signal-to-Noise Ratio 

Assume that the noise environment consists of the random noise of power $\sigma_n^2$ and a
directional interference of power $p_I$ in the nonlook direction. Assume that there is a
source of power $p_s$ in the look direction. Given that the interference and the signal are
uncorrelated, the array signal vector for this case becomes

$$ x(t) = m_s(t)\, e^{j 2\pi f_0 t} S_0 + m_I(t)\, e^{j 2\pi f_0 t} S_I + n(t) \tag{2.2.23} $$

Now we have two directional sources (a signal source and a directional interference) and
the random noise. Thus, it follows from (2.1.22) that the array correlation matrix R is
given by

$$ R = p_s S_0 S_0^H + p_I S_I S_I^H + \sigma_n^2 I \tag{2.2.24} $$

The mean output power of the processor is given by

$$ P(w_c) = w_c^H R\, w_c \tag{2.2.25} $$

Substituting from (2.2.1), (2.2.24) and noting that $S_0^H S_0 = L$, the expression for the
mean output power from (2.2.25) becomes

$$ P(w_c) = p_s + p_I (1 - \rho) + \frac{\sigma_n^2}{L} \tag{2.2.26} $$

Note that the mean output power of the processor is the sum of the mean output powers
due to signal source, directional interference, and uncorrelated noise.

The mean signal power at the output of the processor is equal to the mean power of
the signal source, that is,

$$ P_s = p_s \tag{2.2.27} $$

The mean noise power is the sum of the interference power and the uncorrelated noise
power, that is,

$$ P_N = p_I (1 - \rho) + \frac{\sigma_n^2}{L} \tag{2.2.28} $$

The output signal-to-noise ratio is then given by

$$ \mathrm{SNR} = \frac{P_s}{P_N} = \frac{p_s}{p_I (1 - \rho) + \dfrac{\sigma_n^2}{L}} \tag{2.2.29} $$

Now consider a special case when no directional interference is present. For this case,
the expression for the output SNR becomes

$$ \mathrm{SNR} = \frac{p_s L}{\sigma_n^2} \tag{2.2.30} $$

As the input SNR is $p_s / \sigma_n^2$, this provides an array gain, which is defined as the
ratio of the output SNR to the input SNR, equal to L, the number of elements in the array.

This processor provides maximum output SNR when no directional interference
operating at the same frequency is present. It is not effective in the presence of
directional interference, whether intentional or unintentional. The response of the
processor toward a directional source is given by (2.2.2). The performance of the processor
in the presence of one nonlook directional source indicated by SNR is given by (2.2.29). It
is a function of the interference power and the parameter $\rho$ that in turn depends on the
relative direction of two sources and array geometry.
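
The output SNR expression (2.2.29) can be cross-checked against a direct evaluation of the quadratic forms $w_c^H R_s w_c$, $w_c^H R_I w_c$, and $w_c^H R_n w_c$. The sketch below does this for an assumed ten-element uniform linear array with half-wavelength spacing; the source powers, directions, and helper names are hypothetical.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def quad_form(w, R):
    # w^H R w for a Hermitian R; real up to rounding.
    n = len(w)
    return sum(w[i].conjugate() * R[i][j] * w[j]
               for i in range(n) for j in range(n)).real

L, d = 10, 0.5
p_s, p_I, noise_var = 1.0, 10.0, 0.5      # hypothetical powers
S0 = steering_vector(L, d, 90.0)          # look direction (broadside)
SI = steering_vector(L, d, 60.0)          # interference direction
w_c = [s / L for s in S0]                 # conventional weights (2.2.1)

def rank_one(p, S):
    # p S S^H
    return [[p * S[i] * S[j].conjugate() for j in range(L)] for i in range(L)]

R_s = rank_one(p_s, S0)
R_I = rank_one(p_I, SI)
R_n = [[noise_var if i == j else 0.0 for j in range(L)] for i in range(L)]

# Direct evaluation of signal and total noise power at the output.
P_s = quad_form(w_c, R_s)
P_N = quad_form(w_c, R_I) + quad_form(w_c, R_n)
snr_direct = P_s / P_N

# Closed form (2.2.29) using the parameter rho of (2.2.16).
inner = sum(a.conjugate() * b for a, b in zip(S0, SI))
rho = 1.0 - abs(inner) ** 2 / L ** 2
snr_formula = p_s / (p_I * (1.0 - rho) + noise_var / L)
```

The two values agree to rounding error. Setting `p_I = 0.0` in the same script reduces the result to $p_s L / \sigma_n^2$, the interference-free case.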


2.3 Null Steering Beamformer 

The null steering beamformer is used to cancel a plane wave arriving from a known
direction and thus produces a null in the response pattern in the plane wave's direction
of arrival. One of the earliest schemes, referred to as DICANNE [And69, And69a], achieves
this by estimating the signal arriving from a known direction by steering a conventional
beam in the direction of the source and then subtracting the output of this from each
element. An estimate of the signal is made by delay-and-sum beamforming: shift registers
provide the required delay at each element, such that the signal arriving from the
beam-steering direction appears in phase after the delay, and these wave forms are then
summed with equal weighting. This signal is then subtracted from each element after the
delay. The process is very effective for canceling strong interference and could be repeated
for multiple interference cancelation.

Although the process of subtracting the estimated interference using the delay-and-sum
beamformer in the DICANNE scheme is easy to implement for single interference, it
becomes cumbersome as the number of interferences grows. A beam with unity response
in the desired direction and nulls in interference directions may be formed by estimating
beamformer weights shown in Figure 2.1 using suitable constraints [d'As84, And69a].
Assume that $S_0$ is the steering vector in the direction where unity response is required
and that $S_1, \ldots, S_k$ are k steering vectors associated with k directions where nulls are
required. The desired weight vector is the solution of the following simultaneous equations:

$$ w^H S_0 = 1 \tag{2.3.1} $$

$$ w^H S_i = 0, \quad i = 1, \ldots, k \tag{2.3.2} $$

Using matrix notation, this becomes

$$ w^H A = e_1^T \tag{2.3.3} $$

where A is a matrix with columns being the steering vectors associated with all directional
sources including the look direction, that is,

$$ A \triangleq [S_0, S_1, \ldots, S_k] \tag{2.3.4} $$

and $e_1$ is a vector of all zeros except the first element which is one, that is,

$$ e_1 = [1, 0, \ldots, 0]^T \tag{2.3.5} $$

For k = L - 1, A is a square matrix. Assuming that the inverse of A exists, which requires
that all steering vectors are linearly independent [God81], the solution for the weight
vector is given by

$$ w^H = e_1^T A^{-1} \tag{2.3.6} $$

In case the steering vectors are not linearly independent, A is not invertible and its pseudo
inverse can be used in its place.

It follows from (2.3.6) that due to the structure of the vector $e_1$ the first row of the
inverse of matrix A forms the weight vector. Thus, the weights selected as the first row
of the inverse of matrix A have the desired properties of unity response in the look
direction and nulls in the interference directions.
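
The square case k = L - 1 can be tried out on a small example. Rather than forming the inverse of A explicitly, the sketch below solves the equivalent system $A^H w = e_1$, obtained by taking the conjugate transpose of (2.3.3). It assumes a three-element uniform linear array with half-wavelength spacing; the directions, the helper `solve` (a plain Gaussian elimination written for this illustration), and the other names are hypothetical.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

L, d = 3, 0.5
directions = [90.0, 60.0, 120.0]     # look direction first, then two nulls
vectors = [steering_vector(L, d, t) for t in directions]

# w^H A = e_1^T is equivalent to A^H w = e_1; row k of A^H is S_k^H.
A_H = [[s.conjugate() for s in S] for S in vectors]
e1 = [1.0, 0.0, 0.0]
w = solve(A_H, e1)

# |w^H S_k| for the look direction and the two null directions.
responses = [abs(sum(wi.conjugate() * si for wi, si in zip(w, S)))
             for S in vectors]
```

The resulting weights give unit response in the first direction and zero response in the other two, up to rounding error.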

When the number of required nulls is less than L - 1, A is not a square matrix. A
suitable estimate of weights may be produced using

$$ w^H = e_1^T (A^H A)^{-1} A^H \tag{2.3.7} $$

where the $(k+1) \times (k+1)$ matrix $A^H A$ is invertible when the steering vectors are
linearly independent, so that $w^H A = e_1^T$ is satisfied.

Although the beam pattern produced by this beamformer has nulls in the interference
directions, it is not designed to minimize the uncorrelated noise at the array output. It is
possible to achieve this by selecting weights that minimize the mean output power subject
to the above constraints [Bre88].

An application of a null steering scheme for detecting amplitude-modulated signals by 
placing nulls in the known interference directions is described in [Cho93], which is able 
to cancel a strong jammer in a mobile communication system. The use of a null steering 
scheme for a transmitting array employed at a base station is discussed in [Chi94], which 
minimizes the interferences toward other co-channel mobiles. Performance analysis of a 
null steering algorithm is presented in [Fri89]. 


2.4 Optimal Beamformer 

The null steering scheme described in the previous section requires knowledge of the 
directions of interference sources, and the beamformer using the weights estimated by 
this scheme does not maximize the output SNR. The optimal beamforming method 
described in this section overcomes these limitations and maximizes the output SNR in 
the absence of errors. It should be noted that the optimal beamformer, also known as the
minimum variance distortionless response (MVDR) beamformer, described in this section
does not require knowledge of the directions and power levels of interferences, or of the
level of the background noise power, to maximize the output SNR. It only requires the
direction of the desired signal.

In this section, first we discuss an optimal beamformer with its weights without any
constraints, and then study its performance in the presence of one interference and
uncorrelated noise [God86].


2.4.1 Unconstrained Beamformer

Let an L-dimensional complex vector $w$ represent the weights of the beamformer shown
in Figure 2.1 that maximize the output SNR. For an array that is not constrained, an
expression for $w$ is given by [App76, Ree74, Bre73]:

$$ w = \mu_0 R_N^{-1} S_0 \tag{2.4.1} $$

where $R_N$ is the array correlation matrix of the noise alone, that is, it does not contain
any signal arriving from the look direction $(\phi_0,\theta_0)$, and $\mu_0$ is a constant.

Consider that the noise environment consists of the random noise of power $\sigma_n^2$ and a
directional interference of power $p_I$ in the nonlook direction. Assume that there is a source
of power $p_s$ in the look direction, and that the interference and the signal are
uncorrelated. For this case, the array correlation matrix R is given by

$$ R = p_s S_0 S_0^H + p_I S_I S_I^H + \sigma_n^2 I \tag{2.4.2} $$

The mean output power of the processor is given by

$$ P = w^H R\, w \tag{2.4.3} $$


It follows from (2.4.2) and (2.4.3) that

$$ P = p_s w^H S_0 S_0^H w + p_I w^H S_I S_I^H w + \sigma_n^2 w^H w \tag{2.4.4} $$

Three terms on the RHS of (2.4.4) correspond to the output signal power, residual
interference power, and output uncorrelated noise power of the unconstrained optimal
beamformer. Let these be denoted by $P_s$, $P_I$, and $P_n$, respectively. Thus, it follows that

$$ P_s = p_s w^H S_0 S_0^H w \tag{2.4.5} $$

$$ P_I = p_I w^H S_I S_I^H w \tag{2.4.6} $$

and

$$ P_n = \sigma_n^2 w^H w \tag{2.4.7} $$

Substituting for $w$ and noting that $S_0^H S_0 = L$, these equations become

$$ P_s = p_s \mu_0^2 \left( S_0^H R_N^{-1} S_0 \right)^2 \tag{2.4.8} $$

$$ P_I = \mu_0^2 S_0^H R_N^{-1} R_I R_N^{-1} S_0 \tag{2.4.9} $$

and

$$ P_n = \sigma_n^2 \hat{\beta} \mu_0^2 \left( S_0^H R_N^{-1} S_0 \right)^2 \tag{2.4.10} $$

where $R_I$ is the correlation matrix of interference and

$$ \hat{\beta} = \frac{S_0^H R_N^{-1} R_N^{-1} S_0}{\left( S_0^H R_N^{-1} S_0 \right)^2} \tag{2.4.11} $$

The total noise at the output is given by

$$ P_N = P_I + P_n \tag{2.4.12} $$

Substituting from (2.4.9) and (2.4.10), total noise becomes

$$ \begin{aligned}
P_N &= \mu_0^2 \left( S_0^H R_N^{-1} R_I R_N^{-1} S_0 + \sigma_n^2 S_0^H R_N^{-1} R_N^{-1} S_0 \right) \\
    &= \mu_0^2 S_0^H R_N^{-1} \left( R_I + \sigma_n^2 I \right) R_N^{-1} S_0 \\
    &= \mu_0^2 S_0^H R_N^{-1} R_N R_N^{-1} S_0 \\
    &= \mu_0^2 S_0^H R_N^{-1} S_0
\end{aligned} \tag{2.4.13} $$

2.4.2 Constrained Beamformer 

Let the array weights be constrained to have a unit response in the look direction, that is,

$$ w^H S_0 = 1 \tag{2.4.14} $$

Thus, it follows from (2.4.1) that constant $\mu_0$ is given by

$$ \mu_0 = \frac{1}{S_0^H R_N^{-1} S_0} \tag{2.4.15} $$

Substituting this in (2.4.1) results in the following expression for the weight vector

$$ w = \frac{R_N^{-1} S_0}{S_0^H R_N^{-1} S_0} \tag{2.4.16} $$

Substituting for $\mu_0$ in (2.4.8), (2.4.9), (2.4.10) and (2.4.13) results in the following
expressions for the output signal power, residual interference power, output uncorrelated
noise power, and the total noise power of the constrained beamformer

$$ P_s = p_s \tag{2.4.17} $$

$$ P_I = \frac{S_0^H R_N^{-1} R_I R_N^{-1} S_0}{\left( S_0^H R_N^{-1} S_0 \right)^2} \tag{2.4.18} $$

$$ P_n = \sigma_n^2 \hat{\beta} \tag{2.4.19} $$

and

$$ P_N = \frac{1}{S_0^H R_N^{-1} S_0} \tag{2.4.20} $$

Note from (2.4.19) that $\hat{\beta}$ is the ratio of the uncorrelated noise power at the output
of the constrained beamformer to the uncorrelated noise power at its input.

As the weights for the optimal beamformer discussed above are computed using noise
alone matrix inverse (NAMI), the processor with these weights is referred to as the NAMI
processor [Cox73]. It is also known as the maximum likelihood (ML) filter [Cap69], as it
finds the ML estimate of the power of the signal source, assuming all sources as
interference. It should be noted that $R_N$ may not be invertible when the background noise is
very small. In that case, it becomes a rank deficient matrix.

In practice when the estimate of the noise alone matrix is not available, the total array
correlation matrix (signal plus noise) is used to estimate the weights and the processor is
referred to as the SPNMI (signal-plus-noise matrix inverse) processor. An expression for
the weights of the constrained processor for this case is given by

$$ \hat{w} = \frac{R^{-1} S_0}{S_0^H R^{-1} S_0} \tag{2.4.21} $$

These weights are the solution of the following optimization problem:

$$ \begin{aligned} &\underset{w}{\text{minimize}} && w^H R\, w \\ &\text{subject to} && w^H S_0 = 1 \end{aligned} \tag{2.4.22} $$

Thus, the processor weights are selected by minimizing the mean output power of the
processor while maintaining unity response in the look direction. The constraint ensures
that the signal passes through the processor undistorted. Therefore, the output signal
power is the same as the look direction source power. The minimization process then
minimizes the total noise including interference and the uncorrelated noise. The
minimization of the total output noise while keeping the output signal constant is the same
as maximizing the output SNR.

It should be noted that the weights of the NAMI processor and the SPNMI processor
are identical; and in the absence of errors, the processor performs identically in both cases.
This fact can be proved as follows. The Matrix Inversion Lemma for an invertible matrix
A and a vector x states that

$$ (A + x x^H)^{-1} = A^{-1} - \frac{A^{-1} x x^H A^{-1}}{1 + x^H A^{-1} x} \tag{2.4.23} $$

Since

$$ R = p_s S_0 S_0^H + R_N \tag{2.4.24} $$

it follows from the Matrix Inversion Lemma that

$$ R^{-1} = R_N^{-1} - \frac{p_s R_N^{-1} S_0 S_0^H R_N^{-1}}{1 + p_s S_0^H R_N^{-1} S_0} \tag{2.4.25} $$

Hence

$$ \begin{aligned}
R^{-1} S_0 &= R_N^{-1} S_0 - \frac{p_s R_N^{-1} S_0 S_0^H R_N^{-1} S_0}{1 + p_s S_0^H R_N^{-1} S_0} \\
           &= \frac{R_N^{-1} S_0 \left( 1 + p_s S_0^H R_N^{-1} S_0 \right) - p_s R_N^{-1} S_0 S_0^H R_N^{-1} S_0}{1 + p_s S_0^H R_N^{-1} S_0} \\
           &= \frac{R_N^{-1} S_0}{1 + p_s S_0^H R_N^{-1} S_0}
\end{aligned} \tag{2.4.26} $$

and

$$ S_0^H R^{-1} S_0 = \frac{S_0^H R_N^{-1} S_0}{1 + p_s S_0^H R_N^{-1} S_0} \tag{2.4.27} $$

Equations (2.4.21), (2.4.26), and (2.4.27) imply

$$ \hat{w} = \frac{R_N^{-1} S_0}{S_0^H R_N^{-1} S_0} \tag{2.4.28} $$

Thus,

$$ \hat{w} = w \tag{2.4.29} $$

and the optimal weights of the two processors are identical. The processor with these
weights is referred to as the optimal processor. This is also known as the MVDR beamformer.
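
The equality of the NAMI and SPNMI weights can also be observed numerically. The sketch below computes both weight vectors by solving a linear system with the noise-only matrix and with the total correlation matrix, and then compares them. It assumes a six-element uniform linear array with one interference plus white noise; the helper `solve` is a plain Gaussian elimination written for this illustration, and all names and values are hypothetical.

```python
import cmath
import math

def steering_vector(L, d, theta_deg):
    # Uniform linear array, spacing d in wavelengths (assumed geometry).
    step = 2.0 * math.pi * d * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * l) for l in range(L)]

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

L, d = 6, 0.5
p_s, p_I, noise_var = 1.0, 5.0, 0.2       # hypothetical powers
S0 = steering_vector(L, d, 90.0)          # look direction
SI = steering_vector(L, d, 50.0)          # interference direction

def correlation(include_signal):
    # R_N = p_I S_I S_I^H + sigma_n^2 I, optionally plus p_s S_0 S_0^H.
    R = [[p_I * SI[i] * SI[j].conjugate() + (noise_var if i == j else 0.0)
          for j in range(L)] for i in range(L)]
    if include_signal:
        for i in range(L):
            for j in range(L):
                R[i][j] += p_s * S0[i] * S0[j].conjugate()
    return R

def constrained_weights(R):
    # w = R^{-1} S0 / (S0^H R^{-1} S0), computed by solving R x = S0.
    x = solve(R, S0)
    denom = sum(s.conjugate() * xi for s, xi in zip(S0, x))
    return [xi / denom for xi in x]

w_nami = constrained_weights(correlation(include_signal=False))   # (2.4.16)
w_spnmi = constrained_weights(correlation(include_signal=True))   # (2.4.21)

look_response = sum(w.conjugate() * s for w, s in zip(w_nami, S0))
max_weight_diff = max(abs(a - b) for a, b in zip(w_nami, w_spnmi))
interference_gain = abs(sum(w.conjugate() * s for w, s in zip(w_nami, SI)))
conventional_gain = abs(sum(a.conjugate() * b for a, b in zip(S0, SI))) / L
```

Both weight vectors coincide to rounding error and keep unit response in the look direction, while the response toward the interference is smaller than that of the conventional weights $S_0 / L$.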


2.4.3 Output Signal-to-Noise Ratio and Array Gain 

The mean output power of the optimal processor is given by

$$ P = w^H R\, w = \frac{1}{S_0^H R^{-1} S_0} \tag{2.4.30} $$

This power consists of the signal power, residual interference power, and uncorrelated
noise power. Expressions for these quantities are given by (2.4.17), (2.4.18), and (2.4.19),
respectively. The total noise at the output is the sum of residual interference and
uncorrelated noise. The expression for total noise power is given by (2.4.20).

Let $\hat{\alpha}$ denote the SNR of the optimal beamformer, that is,

$$ \hat{\alpha} = \frac{P_s}{P_N} \tag{2.4.31} $$

It follows from (2.4.17) and (2.4.20) that

$$ \hat{\alpha} = p_s S_0^H R_N^{-1} S_0 \tag{2.4.32} $$

It should be noted that the same result also follows from (2.4.8) and (2.4.13), the
expressions for the signal power and the total noise power at the output of unconstrained
beamformer. Thus, the constrained as well as unconstrained beamformer results in the
same output SNR.

The array gain of a beamformer is defined as the ratio of the output SNR to the input
SNR. Let G denote the array gain of the optimal beamformer, that is,

$$ G = \frac{\hat{\alpha}}{\text{Input SNR}} \tag{2.4.33} $$

Let $p_N$ denote the total noise at the input. SNR at the input of the beamformer is then
given by

$$ \text{Input SNR} = \frac{p_s}{p_N} \tag{2.4.34} $$

It follows from (2.4.32), (2.4.33) and (2.4.34) that

$$ G = p_N S_0^H R_N^{-1} S_0 = \frac{p_N}{P_N} \tag{2.4.35} $$

2.4.4 Special Case 1: Uncorrelated Noise Only 

For a special case of the noise environment when no directional interference is present,
the noise-only array correlation matrix is given by

$$ R_N = \sigma_n^2 I \tag{2.4.36} $$

Substituting the matrix in (2.4.16), a simple calculation yields

$$ w = \frac{S_0}{L} \tag{2.4.37} $$

Thus, the weights of the optimal processor in the absence of errors are the same as those
of the conventional processor, implying that the conventional processor is the optimal
processor for this case. Thus, in the absence of directional interferences the conventional
processor yields the maximum output SNR and the array gain. The output SNR
$\hat{\alpha}$ and the array gain G of the optimal processor for this case are, respectively,
given by

$$ \hat{\alpha} = \frac{p_s L}{\sigma_n^2} \tag{2.4.38} $$

and

$$ G = L \tag{2.4.39} $$

These quantities are independent of array geometry and depend only on the number of
elements in the array.


2.4.5 Special Case 2: One Directional Interference 

Consider the case of a noise environment consisting of a directional interference of power 
p, and uncorrelated noise of power a n 2 on each element of the array. Let S t denote the 
steering vector in the direction of interference. For this case, the noise-only array correla¬ 
tion matrix is given by 

R N =o n 2 I + p I S I S« (2.4.40) 

Using the Matrix Inversion Lemma, this yields 


R 


-l 

N 



SjST 

Pi 


(2.4.41) 


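Substituting for $\mathbf{R}_N^{-1}$, rearranging, and carrying out the algebra leads to the 
following expression for the output SNR: 

$$\hat{\alpha} = \frac{p_S L}{\sigma_n^2}\,\frac{\rho + \dfrac{\sigma_n^2}{p_I L}}{1 + \dfrac{\sigma_n^2}{p_I L}} \tag{2.4.42}$$

The array gain is given by 

$$G = \frac{p_I L}{\sigma_n^2}\,\frac{\left(1 + \dfrac{\sigma_n^2}{p_I}\right)\left(\rho + \dfrac{\sigma_n^2}{p_I L}\right)}{1 + \dfrac{\sigma_n^2}{p_I L}} \tag{2.4.43}$$

where 

$$\rho = 1 - \frac{\mathbf{S}_0^H \mathbf{S}_I \mathbf{S}_I^H \mathbf{S}_0}{L^2} \tag{2.4.44}$$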


is a scalar quantity and depends on the direction of the interference relative to the signal 
source and the array geometry, as discussed previously. It follows from (2.2.1) and (2.4.44) 
after rearrangement that 

$$\rho = 1 - \mathbf{w}_c^H \mathbf{S}_I \mathbf{S}_I^H \mathbf{w}_c \tag{2.4.45}$$


Thus, this parameter is characterized by the weights of the conventional processor. As 
this parameter characterizes the performance of the optimal processor, it implies that the 
performance of the optimal processor in terms of its interference cancelation capability 
depends to a certain extent on the response of the conventional processor to interference. 
This fact has been further highlighted in [Gup82, Gup84]. 

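An interesting special case is when the interference is much stronger than the 
background noise, $p_I \gg \sigma_n^2$. For this case, these expressions may be approximated as 

$$\hat{\alpha} \approx \frac{p_S L \rho}{\sigma_n^2} \tag{2.4.46}$$

and 

$$G \approx \frac{p_I L \rho}{\sigma_n^2} \tag{2.4.47}$$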


When interference is away from the main lobe of the conventional processor ($\rho \approx 1$), 
it follows that the output SNR of the optimal processor in the presence of a strong 
interference is the same as that of the conventional processor in the absence of interference. 
This implies that the processor has almost completely canceled the interference, yielding a 
very large array gain. 
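
The closed forms for the single-interference case can be checked numerically with the sketch below. It applies the inverse of (2.4.41) to a vector, compares the resulting output SNR with (2.4.42), and evaluates the strong-interference approximations (2.4.46) and (2.4.47). The uniform linear array, the chosen powers, and all function names are assumptions made for illustration.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def rho_parameter(s0, s_i):
    """(2.4.44): rho = 1 - |S_0^H S_I|^2 / L^2."""
    return 1.0 - abs(inner(s0, s_i)) ** 2 / len(s0) ** 2


def rn_inverse_times(v, s_i, p_i, sigma2):
    """Apply R_N^{-1} of (2.4.41) to a vector v without forming the matrix."""
    scale = inner(s_i, v) / (len(s_i) + sigma2 / p_i)
    return [(x - s * scale) / sigma2 for x, s in zip(v, s_i)]


def output_snr_direct(s0, s_i, p_s, p_i, sigma2):
    """Output SNR computed as p_S S_0^H R_N^{-1} S_0."""
    return p_s * inner(s0, rn_inverse_times(s0, s_i, p_i, sigma2)).real


def output_snr_closed_form(s0, s_i, p_s, p_i, sigma2):
    """(2.4.42), written in terms of rho."""
    L = len(s0)
    eps = sigma2 / (p_i * L)
    return (p_s * L / sigma2) * (rho_parameter(s0, s_i) + eps) / (1.0 + eps)


L = 10
s0 = steering_vector(L, 90.0)    # look direction (broadside)
s_i = steering_vector(L, 30.0)   # interference direction
p_s, p_i, sigma2 = 1.0, 100.0, 0.01

rho = rho_parameter(s0, s_i)
snr_direct = output_snr_direct(s0, s_i, p_s, p_i, sigma2)
snr_closed = output_snr_closed_form(s0, s_i, p_s, p_i, sigma2)
snr_strong = p_s * L * rho / sigma2            # (2.4.46)
gain = (p_i + sigma2) * snr_direct / p_s       # G = p_N S_0^H R_N^{-1} S_0
gain_strong = p_i * L * rho / sigma2           # (2.4.47)
```

With the interference 40 dB above the noise, the exact and approximate values agree closely, and the gain is far larger than $L$ because the interference is almost completely canceled.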

The performance of the processor in terms of its output SNR and the array gain is not 
affected by the look direction constraint, as it only scales the weights. Therefore, the 
treatment presented above is valid for the unconstrained processor. 

For the optimal beamformer to operate as described above and to maximize the SNR 
by canceling interferences, the number of interferences must be less than or equal to $L - 2$, 
as an array with $L$ elements has $L - 1$ degrees of freedom and one has been utilized by 
the constraint in the look direction. This may not be true in a mobile communications 
environment due to the existence of multipath arrivals, and the array beamformer may 
not be able to achieve the maximization of the output SNR by suppressing every 
interference. However, as argued in [Win84], the beamformer does not have to suppress 
interferences to a great extent and cause a vast increase in the output SNR to improve the 
performance of a mobile radio system. An increase of a few decibels in the output SNR 
can make possible a large increase in the system's channel capacity. 

In the mobile communication literature, the optimal beamformer is often referred to as 
the optimal combiner. Discussion on the use of the optimal combiner to cancel interferences 
and to improve the performance of mobile communication systems can be found in 
[Win84, Win87, Sua93, Nag94a]. The optimal combiner is discussed in detail in a later 
chapter. 

In the next section, a processor is described that requires a reference signal instead of 
the desired signal direction to estimate the optimal weights of the beamformer. 


2.5 Optimization Using Reference Signal 

FIGURE 2.8 

An array system using reference signal. 

A narrowband beamforming structure that employs a reference signal [App76, Wid67, 
Wid75, Zah73, App76a, Wid82] to estimate the weights of the beamformer is shown in 
Figure 2.8. The array output is subtracted from an available reference signal $r(t)$ to generate 
an error signal $\varepsilon(t) = r(t) - \mathbf{w}^H \mathbf{x}(t)$ that is used to control the 
weights. Weights are adjusted such that the mean squared error between the array output and 
the reference signal is minimized. The mean squared error $\xi(\mathbf{w})$ for a given 
$\mathbf{w}$ is given by 

$$\begin{aligned}
\xi(\mathbf{w}) &= E\left[|\varepsilon(t)|^2\right] \\
&= E\left[\varepsilon(t)\varepsilon^*(t)\right] \\
&= E\left[\{r(t) - \mathbf{w}^H\mathbf{x}(t)\}\{r(t) - \mathbf{w}^H\mathbf{x}(t)\}^*\right] \\
&= E\left[r(t)r^*(t) + \mathbf{w}^H\mathbf{x}(t)\mathbf{x}^H(t)\mathbf{w} - \mathbf{w}^H\mathbf{x}(t)r^*(t) - r(t)\mathbf{x}^H(t)\mathbf{w}\right] \\
&= E\left[|r(t)|^2\right] + \mathbf{w}^H\mathbf{R}\mathbf{w} - \mathbf{w}^H\mathbf{z} - \mathbf{z}^H\mathbf{w}
\end{aligned} \tag{2.5.1}$$

where 

$$\mathbf{z} = E\left[\mathbf{x}(t)r^*(t)\right] \tag{2.5.2}$$

is the correlation between the reference signal and the array signals vector $\mathbf{x}(t)$. 

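The mean square error (MSE) surface is a quadratic function of $\mathbf{w}$ and is minimized by 
setting its gradient with respect to $\mathbf{w}$ equal to zero, with its solution yielding the 
optimal weight vector, that is, 

$$\left.\frac{\partial \xi(\mathbf{w})}{\partial \mathbf{w}}\right|_{\mathbf{w} = \hat{\mathbf{w}}_{MSE}} = \mathbf{0} \tag{2.5.3}$$

© 2004 by CRC Press LLC 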

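The gradient of MSE with respect to $\mathbf{w}$ is obtained by differentiating both sides of 
(2.5.1) with respect to $\mathbf{w}$, yielding 

$$\frac{\partial \xi(\mathbf{w})}{\partial \mathbf{w}} = 2\mathbf{R}\mathbf{w} - 2\mathbf{z} \tag{2.5.4}$$

Substituting (2.5.4) in (2.5.3) and solving, you obtain the well-known Wiener-Hopf 
equation for optimal weights: 

$$\hat{\mathbf{w}}_{MSE} = \mathbf{R}^{-1}\mathbf{z} \tag{2.5.5}$$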


The processor with these weights is also known as the Wiener filter. The minimum MSE 
$\hat{\xi}$ of the processor using these weights is obtained by substituting 
$\hat{\mathbf{w}}_{MSE}$ for $\mathbf{w}$ in (2.5.1), resulting in 

$$\hat{\xi} = E\left[|r(t)|^2\right] - \mathbf{z}^H \mathbf{R}^{-1} \mathbf{z} \tag{2.5.6}$$
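
A small numerical illustration of (2.5.1), (2.5.5), and (2.5.6) follows. It is a sketch under assumptions that are not in the text: the array correlation matrix is built analytically from one signal, one interference, and uncorrelated noise; the reference is taken to equal the desired signal, so that $\mathbf{z} = p_S \mathbf{S}_0$; and the linear system is solved with a small hand-written Gauss-Jordan routine. All function names are hypothetical.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(sources, sigma2):
    """R = sum_k p_k S_k S_k^H + sigma2 * I for (power, steering vector) pairs."""
    n = len(sources[0][1])
    return [[sum(p * s[i] * s[j].conjugate() for p, s in sources)
             + (sigma2 if i == j else 0.0) for j in range(n)] for i in range(n)]


def mat_vec(mat, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mse(w, r_mat, z, ref_power):
    """(2.5.1): xi(w) = E|r|^2 + w^H R w - w^H z - z^H w."""
    return (ref_power + inner(w, mat_vec(r_mat, w)) - inner(w, z) - inner(z, w)).real


L = 4
s0 = steering_vector(L, 90.0)
s_i = steering_vector(L, 40.0)
p_s, p_i, sigma2 = 1.0, 5.0, 0.1

R = correlation_matrix([(p_s, s0), (p_i, s_i)], sigma2)
z = [p_s * s for s in s0]               # z = E[x r*] when r(t) equals the desired signal
w_mse = solve(R, z)                     # (2.5.5)
min_mse = p_s - inner(z, w_mse).real    # (2.5.6)
```

Evaluating `mse(w_mse, R, z, p_s)` reproduces `min_mse`, and any perturbation of `w_mse` gives a larger value, as expected for a quadratic surface with a unique minimum.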


This scheme may be employed to acquire a weak signal in the presence of a strong 
jammer as discussed in [Zah73] by setting the reference signal to zero and initializing the 
weights to provide an omnidirectional pattern. The process starts to cancel strong 
interferences first and the weak signal later. Thus, intuitively, a time is expected when the 
output would consist of the signal, which has not been canceled too much, while the strong 
interference has been reduced. 

When an adaptive scheme (discussed in Chapter 3) is used to estimate 
$\hat{\mathbf{w}}_{MSE}$, the strong jammer gets canceled first as the weights are adjusted to 
put a null in that direction to leave the signal-to-jammer ratio sufficient for acquisition. 

Arrays using a reference signal equal to zero to adjust weights are referred to as power- 
inversion adaptive arrays [Com79]. The MSE minimization scheme (the Wiener filter) is 
a closed-loop method compared to the open-loop scheme of MVDR (the ML filter) 
described in the previous section. In general, the Wiener filter provides a higher output SNR 
compared to the ML filter in the presence of a weak signal source. As the input signal 
power becomes large compared to the background noise, the two processors give almost 
the same results [Gri67]. This result is supported by a simulation study using mobile 
communications with two vehicles [Fli94]. The increased SNR by the Wiener filter is 
achieved at the cost of signal distortion caused by the filter. It should be noted that the 
optimal beamformer does not distort the signal. 

The required reference signal for the Wiener filter may be generated in a number of 
ways depending on the application. In digital mobile communications, a synchronization 
signal may be used for initial weight estimation followed by the use of a detected signal 
as a reference signal. In systems using the TDMA scheme, a user-specific sequence may 
be part of every frame for this purpose [Win94]. The use of known symbols in every frame 
has also been suggested in [Cho92]. In other situations, use of an antenna for this purpose 
has been examined to show the suitability to provide a reference signal [Cho92]. 

Studies of mobile communication systems using reference signal to estimate array 
weights have also been reported in [And91, Geb95, Dio93]. 

FIGURE 2.9 

Beam-space processor structure. 


2.6 Beam Space Processing 

In contrast to the element space processing discussed in previous sections, where signals 
derived from each element are weighted and summed to produce the array output, beam 
space processing is a two-stage scheme. The first stage takes the array signals as input and 
produces a set of multiple outputs, which are then weighted and combined to produce the 
array output. These multiple outputs may be thought of as the outputs of multiple beams. 
The first stage applies fixed weights to the array signals, which amounts to producing 
multiple beams steered in different directions. The weighted sum of these beams forms the 
array output, and the weights applied to the different beam outputs are optimized to meet 
a specific optimization criterion. 

In general, for an L-element array, a beam space processor consists of a main beam 
steered in the signal direction and a set of not more than L - 1 secondary beams. The 
weighted output of the secondary beams is subtracted from the main beam. The weights 
are adjusted to produce an estimate of the interference present in the main beam. The 
subtraction process then removes this interference. The secondary beams, also known as 
auxiliary beams, are designed such that they do not contain the desired signal from the 
look direction, to avoid signal cancelation in the subtraction process. A general structure 
of such a processor is shown in Figure 2.9. Beam space processors have been studied under 
many different names including the Howells-Applebaum array [App76, App76a, How76]; 
generalized side-lobe canceler (GSC) [Gri82, Gri77]; partitioned processor [Jim77, Can82]; 
partially adaptive arrays [Van87, Van89, Van90, Qia94, Qia95, Cha76, Mor78]; 
postbeamformer interference canceler [Can84, God86a, God89, God89a, God91]; adaptive-adaptive 
arrays [Bro86]; and multiple-beam antennas [May78, Kle75, Gob76]. 

The pattern of the main beam is normally referred to as the quiescent pattern and is 
chosen such that it has a desired shape. For a linear array of equispaced elements with 
equal weighting, the quiescent pattern has the shape of $\sin Lx / \sin x$, with $L$ being the 
number of elements in the array, whereas for Tschebysheff weighting (the weighting 
dependent on Tschebysheff polynomial coefficients), the pattern has equal side-lobe levels 
[Dol46]. The beam pattern of the main beam may be adjusted by applying various forms 
of constraints on the weights [App76a] and using various pattern synthesis techniques 
discussed in [Gri87, Tse92, Web90, Er93, Sim83, Ng02]. 

There are many schemes to generate the outputs of auxiliary beams such that no signal 
from the look direction is contained in them, that is, these beams have nulls in the look 
direction. In its simplest form, it can be achieved by subtracting the array signals from 
presteered adjacent pairs [Gab76, Dav67]. It relies on the fact that the component of the 
array signals induced from a source in the look direction is identical after the presteering, 
and this gets canceled in the subtraction process from the adjacent pairs. The process can 
be generalized to produce $M - 1$ beams from an $L$-element array signal $\mathbf{x}(t)$ using 
a matrix $\mathbf{B}$ such that 

$$\mathbf{q}(t) = \mathbf{B}^H \mathbf{x}(t) \tag{2.6.1}$$

where the $M - 1$ dimensional vector $\mathbf{q}(t)$ denotes the outputs of $M - 1$ beams and 
the matrix $\mathbf{B}$, referred to as the blocking matrix or the matrix prefilter, has the 
property that its $M - 1$ columns are linearly independent and the sum of elements of each 
column equals zero, implying that the $M - 1$ beams are independent and have nulls in the 
look direction. For an array that is not presteered, the matrix needs to satisfy 

$$\mathbf{B}^H \mathbf{S}_0 = \mathbf{0} \tag{2.6.2}$$

where $\mathbf{S}_0$ is the steering vector associated with the look direction and 
$\mathbf{0}$ denotes a vector of zeros. 
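
The adjacent-pair subtraction described above can be written as a blocking matrix and checked against the zero-column-sum property and against (2.6.2). The sketch below is illustrative only: the uniform linear array, the use of $\mathrm{diag}(\mathbf{S}_0)$ to adapt the matrix to an array that is not presteered, and the function names are assumptions.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def pairwise_blocking_matrix(num_elements):
    """L x (L - 1) matrix; column k forms element k minus element k + 1."""
    return [[1.0 if r == c else (-1.0 if r == c + 1 else 0.0)
             for c in range(num_elements - 1)] for r in range(num_elements)]


def steer_blocking_matrix(blocking, s0):
    """diag(S_0) times B: a blocking matrix for an array that is not presteered."""
    return [[s * entry for entry in row] for s, row in zip(s0, blocking)]


def hermitian_times(mat, vec):
    """Compute mat^H vec for a matrix stored as a list of rows."""
    return [sum(mat[r][c].conjugate() * vec[r] for r in range(len(vec)))
            for c in range(len(mat[0]))]


L = 6
B = pairwise_blocking_matrix(L)
s0 = steering_vector(L, 60.0)        # look direction
s_other = steering_vector(L, 30.0)   # some other direction

presteered_leak = hermitian_times(B, [1.0] * L)          # B^H 1 for a presteered array
B_steered = steer_blocking_matrix(B, s0)
look_leak = hermitian_times(B_steered, s0)               # should vanish, see (2.6.2)
other_response = hermitian_times(B_steered, s_other)     # not blocked
```

Every entry of `presteered_leak` and `look_leak` is zero to machine precision, while a source away from the look direction still appears at the auxiliary outputs.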

It is assumed in the above discussion that $M \le L$, implying that the number of beams 
is less than or equal to the number of elements in the array. When the number of beams 
is equal to the number of elements in the array, the processing in the beam space has not 
reduced the degree of freedom of the array, that is, its null-forming capability has not been 
reduced. In this sense, these arrays are fully adaptive and have the same capabilities as 
that of the array using element space processing. In fact, in the absence of errors, both 
processing schemes produce identical results. On the other hand, when the number of 
beams is less than the number of elements, the arrays are referred to as partially adaptive. 
The null steering capabilities of these arrays have been reduced to equal the number of 
auxiliary beams. When adaptive schemes, discussed later, are used to estimate the weights, 
convergence is generally faster for these arrays. However, the MSE for these arrays is also 
high compared to fully adaptive arrays [Van91]. 

These arrays are useful in situations where the number of interferences is much smaller 
than the number of elements and offer a computational advantage over element space 
processing, as you only need to adjust $M - 1$ weights compared to $L$ weights for the element 
space case with $M < L$. Moreover, beam space processing requires less computation than 
the element space case to calculate the weights in general as it solves an unconstrained 
optimization compared to the constrained optimization problem solved in the latter case. 
It should be noted that for the element space processing case, constraints on the weights 
are imposed to prevent distortion of the signal arriving from the look direction and to 
make the array more robust against errors. For the beam space case, constraints are 
transferred to the main beam, leaving the adjustable weights free from constraints. 

Auxiliary beamforming techniques other than the use of a blocking matrix described 
above include formation of $M - 1$ orthogonal beams and formation of beams in the 
direction of interference, if known. The beams are referred to as orthogonal beams to imply 
that the weight vectors used to form beams are orthogonal, that is, their dot product is 
equal to zero. The eigenvectors of the array correlation matrix taken as weights to generate 
auxiliary beams fall into this category. In situations where directions of arrival of 
interference are known, the formation of beams pointed in these directions may lead to more 
efficient interference cancelation [Bro86, Gab86]. 

Auxiliary beam outputs are weighted and summed, and the result is subtracted from 
the main beam output to cancel the unwanted interference present in the main beam. The 
weights are adjusted to cancel the maximum possible interference. This is normally done 
by minimizing the total mean output power after subtraction by solving the unconstrained 
optimization problem and leads to maximization of the output SNR in the absence of the 
desired signal in auxiliary channels. The presence of the signal in these channels causes 
signal cancelation from the main beam along with interference cancelation. A detailed 
discussion on the principles of signal cancelation in general and some possible cures is 
given in [Wid75, Wid82, Su86]. 

Use of multiple-beam array processing techniques for mobile communications has been 
reported in various studies [Jon95, Sak92], including development of an array system 
using digital hardware to study its feasibility [God02]. 


2.6.1 Optimal Beam Space Processor 

It follows from Figure 2.9 that the output of the main beam $\psi(t)$ is given by 

$$\psi(t) = \mathbf{V}^H \mathbf{x}(t) \tag{2.6.3}$$

where the $L$-dimensional vector $\mathbf{V}$ is defined as 

$$\mathbf{V} = [v_1, v_2, \ldots, v_L]^T \tag{2.6.4}$$

Let an $M - 1$ dimensional vector $\mathbf{q}(t)$ be defined as 

$$\mathbf{q}(t) = [q_1(t), q_2(t), \ldots, q_{M-1}(t)]^T \tag{2.6.5}$$

It denotes the outputs of the $M - 1$ auxiliary beams formed by the matrix prefilter 
$\mathbf{B}$ and is given by 

$$\mathbf{q}(t) = \mathbf{B}^H \mathbf{x}(t) \tag{2.6.6}$$

Let an $M - 1$ dimensional vector $\mathbf{w}$ denote the adjustable weights of the auxiliary 
beams. It follows from Figure 2.9 that the output $\eta(t)$ of the interference beam is given by 

$$\eta(t) = \mathbf{w}^H \mathbf{q}(t) \tag{2.6.7}$$

The output $y(t)$ of the overall beam space processor is obtained by subtracting the 
interference beam output from the main beam, and thus is given by 

$$\begin{aligned}
y(t) &= \psi(t) - \eta(t) \\
&= \psi(t) - \mathbf{w}^H \mathbf{q}(t)
\end{aligned} \tag{2.6.8}$$


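The mean output power $P(\mathbf{w})$ of the processor for a given weight vector $\mathbf{w}$ 
is given by 

$$\begin{aligned}
P(\mathbf{w}) &= E\left[y(t)y^*(t)\right] \\
&= E\left[\{\psi(t) - \mathbf{w}^H\mathbf{q}(t)\}\{\psi(t) - \mathbf{w}^H\mathbf{q}(t)\}^*\right] \\
&= E\left[\psi(t)\psi^*(t) + \mathbf{w}^H\mathbf{q}(t)\mathbf{q}^H(t)\mathbf{w} - \mathbf{w}^H\mathbf{q}(t)\psi^*(t) - \psi(t)\mathbf{q}^H(t)\mathbf{w}\right] \\
&= P_0 + \mathbf{w}^H\mathbf{R}_{qq}\mathbf{w} - \mathbf{w}^H\mathbf{Z} - \mathbf{Z}^H\mathbf{w}
\end{aligned} \tag{2.6.9}$$

where $P_0$ is the mean power of the main beam given by 

$$P_0 = \mathbf{V}^H \mathbf{R} \mathbf{V} \tag{2.6.10}$$

$\mathbf{R}_{qq}$ is the correlation matrix of auxiliary beams defined as 

$$\mathbf{R}_{qq} = E\left[\mathbf{q}(t)\mathbf{q}^H(t)\right] \tag{2.6.11}$$

and $\mathbf{Z}$ denotes the correlation between the output of auxiliary beams and the main 
beam. It is defined as 

$$\mathbf{Z} = E\left[\mathbf{q}(t)\psi^*(t)\right] \tag{2.6.12}$$

A substitution for $\mathbf{q}(t)$ and $\psi(t)$ in (2.6.11) and (2.6.12) yields 

$$\begin{aligned}
\mathbf{R}_{qq} &= E\left[\mathbf{q}(t)\mathbf{q}^H(t)\right] \\
&= E\left[\mathbf{B}^H\mathbf{x}(t)\mathbf{x}^H(t)\mathbf{B}\right] \\
&= \mathbf{B}^H \mathbf{R} \mathbf{B}
\end{aligned} \tag{2.6.13}$$

$$\begin{aligned}
\mathbf{Z} &= E\left[\mathbf{q}(t)\psi^*(t)\right] \\
&= E\left[\mathbf{B}^H\mathbf{x}(t)\mathbf{x}^H(t)\mathbf{V}\right] \\
&= \mathbf{B}^H \mathbf{R} \mathbf{V}
\end{aligned} \tag{2.6.14}$$

Substituting for $P_0$, $\mathbf{R}_{qq}$, and $\mathbf{Z}$ in (2.6.9), the expression for 
$P(\mathbf{w})$ becomes 

$$P(\mathbf{w}) = \mathbf{V}^H\mathbf{R}\mathbf{V} + \mathbf{w}^H\mathbf{B}^H\mathbf{R}\mathbf{B}\mathbf{w} - \mathbf{w}^H\mathbf{B}^H\mathbf{R}\mathbf{V} - \mathbf{V}^H\mathbf{R}\mathbf{B}\mathbf{w} \tag{2.6.15}$$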

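Note that $P(\mathbf{w})$ is a quadratic function of $\mathbf{w}$ and has a unique minimum. Let 
$\hat{\mathbf{w}}$ denote the weights that minimize $P(\mathbf{w})$. Thus, it follows that 

$$\left.\frac{\partial P(\mathbf{w})}{\partial \mathbf{w}}\right|_{\mathbf{w} = \hat{\mathbf{w}}} = \mathbf{0} \tag{2.6.16}$$

Substituting (2.6.15) in (2.6.16) yields 

$$\mathbf{B}^H\mathbf{R}\mathbf{B}\hat{\mathbf{w}} - \mathbf{B}^H\mathbf{R}\mathbf{V} = \mathbf{0} \tag{2.6.17}$$

As $\mathbf{B}$ has rank $M - 1$, $\mathbf{B}^H\mathbf{R}\mathbf{B}$ is of full rank and its 
inverse exists. Thus, (2.6.17) yields 

$$\hat{\mathbf{w}} = (\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}\mathbf{V} \tag{2.6.18}$$

Substituting $\mathbf{w} = \hat{\mathbf{w}}$ from (2.6.18) in (2.6.15), you obtain the following 
expression for the mean output power of the optimal processor: 

$$P(\hat{\mathbf{w}}) = \mathbf{V}^H\mathbf{R}\mathbf{V} - \mathbf{V}^H\mathbf{R}\mathbf{B}(\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}\mathbf{V} \tag{2.6.19}$$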
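
As a numerical sanity check of (2.6.18) and (2.6.19), the sketch below evaluates the optimal beam space weights for a fully adaptive case ($M = L$) and compares the minimum output power with the standard unity-constrained element space optimum $1/(\mathbf{S}_0^H\mathbf{R}^{-1}\mathbf{S}_0)$. The scenario, the choice of blocking columns (steered adjacent differences), the Gauss-Jordan solver, and all names are assumptions for illustration.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(sources, sigma2):
    """R = sum_k p_k S_k S_k^H + sigma2 * I for (power, steering vector) pairs."""
    n = len(sources[0][1])
    return [[sum(p * s[i] * s[j].conjugate() for p, s in sources)
             + (sigma2 if i == j else 0.0) for j in range(n)] for i in range(n)]


def mat_vec(mat, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def blocking_columns(s0):
    """Columns b_k with b_k^H S_0 = 0: steered differences of adjacent elements."""
    cols = []
    for k in range(len(s0) - 1):
        col = [0j] * len(s0)
        col[k], col[k + 1] = s0[k], -s0[k + 1]
        cols.append(col)
    return cols


L = 4
s0 = steering_vector(L, 90.0)
sources = [(1.0, s0), (10.0, steering_vector(L, 40.0)), (3.0, steering_vector(L, 120.0))]
R = correlation_matrix(sources, sigma2=0.1)

V = [s / L for s in s0]                      # main beam with V^H S_0 = 1
b_cols = blocking_columns(s0)                # L - 1 columns, so M = L
RV = mat_vec(R, V)
BRB = [[inner(bi, mat_vec(R, bj)) for bj in b_cols] for bi in b_cols]   # B^H R B
BRV = [inner(bi, RV) for bi in b_cols]                                  # B^H R V
w_hat = solve(BRB, BRV)                                                 # (2.6.18)
p_beam_space = (inner(V, RV) - inner(BRV, w_hat)).real                  # (2.6.19)
p_element_space = 1.0 / inner(s0, solve(R, s0)).real
```

The two powers coincide, in line with the earlier remark that beam space and element space processing give identical results when the number of beams equals the number of elements.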

Expressions for the mean output signal power may be obtained by replacing the array 
correlation matrix $\mathbf{R}$ by the signal-only array correlation matrix $\mathbf{R}_S$ in 
(2.6.15), yielding 

$$P_S(\mathbf{w}) = \mathbf{V}^H\mathbf{R}_S\mathbf{V} + \mathbf{w}^H\mathbf{B}^H\mathbf{R}_S\mathbf{B}\mathbf{w} - \mathbf{w}^H\mathbf{B}^H\mathbf{R}_S\mathbf{V} - \mathbf{V}^H\mathbf{R}_S\mathbf{B}\mathbf{w} \tag{2.6.20}$$

Since 

$$\mathbf{R}_S = p_S \mathbf{S}_0 \mathbf{S}_0^H \tag{2.6.21}$$

and 

$$\mathbf{B}^H \mathbf{S}_0 = \mathbf{0} \tag{2.6.22}$$

it follows from (2.6.20) that 

$$P_S(\mathbf{w}) = \mathbf{V}^H \mathbf{R}_S \mathbf{V} \tag{2.6.23}$$

Thus, when the blocking matrix $\mathbf{B}$ is selected such that 
$\mathbf{B}^H\mathbf{S}_0 = \mathbf{0}$, no signal flows through the interference beam and the 
output signal power is present only in the main beam. When the main beam is taken as the 
conventional beam, that is, 

$$\mathbf{V} = \frac{\mathbf{S}_0}{L} \tag{2.6.24}$$

the mean output signal power of the beam space processor becomes 

$$P_S(\mathbf{w}) = p_S \tag{2.6.25}$$

Note that the signal power is independent of $\mathbf{w}$. 

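Similarly, an expression for the mean output noise power may be obtained by replacing 
the array correlation matrix $\mathbf{R}$ by the noise-only array correlation matrix 
$\mathbf{R}_N$ in (2.6.15), yielding 

$$P_N(\mathbf{w}) = \mathbf{V}^H\mathbf{R}_N\mathbf{V} + \mathbf{w}^H\mathbf{B}^H\mathbf{R}_N\mathbf{B}\mathbf{w} - \mathbf{w}^H\mathbf{B}^H\mathbf{R}_N\mathbf{V} - \mathbf{V}^H\mathbf{R}_N\mathbf{B}\mathbf{w} \tag{2.6.26}$$

Substituting $\mathbf{w} = \hat{\mathbf{w}}$ from (2.6.18) in (2.6.26), you obtain the following 
expression for the mean output noise power of the optimal processor: 

$$\begin{aligned}
P_N(\hat{\mathbf{w}}) ={}& \mathbf{V}^H\mathbf{R}_N\mathbf{V} + \mathbf{V}^H\mathbf{R}\mathbf{B}(\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}_N\mathbf{B}(\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}\mathbf{V} \\
&- \mathbf{V}^H\mathbf{R}\mathbf{B}(\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}_N\mathbf{V} - \mathbf{V}^H\mathbf{R}_N\mathbf{B}(\mathbf{B}^H\mathbf{R}\mathbf{B})^{-1}\mathbf{B}^H\mathbf{R}\mathbf{V}
\end{aligned} \tag{2.6.27}$$

The output SNR of the optimal beam space processor then becomes 

$$\mathrm{SNR}(\hat{\mathbf{w}}) = \frac{p_S}{P_N(\hat{\mathbf{w}})} \tag{2.6.28}$$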


These expressions cannot be simplified further without considering specific cases. In 
Section 2.6.3, a special case of beam space processor is considered where only one auxiliary 
beam is considered in the presence of one interference source to understand the behavior 
of beam space processors. The results are then compared with an element space processor. 
In the next section, a beam space processor referred to as the generalized side-lobe canceler 
(GSC) is considered. The main difference between the general beam space processor 
considered in this section and the GSC is that the GSC uses presteering delays. 


2.6.2 Generalized Side-Lobe Canceler 

FIGURE 2.10 

Generalized side-lobe canceler structure. 

A structure of the generalized side-lobe canceler is shown in Figure 2.10. The array is 
presteered by delaying received signals on all antennas such that the component of the 
received signal on all elements arriving from the look direction is in phase after presteering 
delays. Let $\alpha_l$, $l = 1, 2, \ldots, L$, denote the phase delays to steer the array in the 
look direction. These are given by 

$$\alpha_l = 2\pi f_0 \tau_l(\phi_0, \theta_0), \quad l = 1, 2, \ldots, L \tag{2.6.29}$$

Let the received signals after presteering delays be denoted by $\mathbf{x}'(t)$. As these are 
delayed versions of $\mathbf{x}(t)$, it follows that their $l$th components are related by 

$$x'_l(t) = x_l\big(t - \tau_l(\phi_0, \theta_0)\big) \tag{2.6.30}$$

This along with (2.1.9) implies that $\mathbf{x}'(t)$ is related to $\mathbf{x}(t)$ by 

$$\mathbf{x}'(t) = \boldsymbol{\Phi}_0^H \mathbf{x}(t) \tag{2.6.31}$$

where $\boldsymbol{\Phi}_0$ is a diagonal matrix defined as 

$$\boldsymbol{\Phi}_0 = \begin{bmatrix} e^{j\alpha_1} & & 0 \\ & \ddots & \\ 0 & & e^{j\alpha_L} \end{bmatrix} \tag{2.6.32}$$

Note that $\boldsymbol{\Phi}_0$ satisfies the relation 
$\boldsymbol{\Phi}_0^H \mathbf{S}_0 = \mathbf{1}$, where $\mathbf{1}$ is a vector of ones. 
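
The effect of presteering can be illustrated with a few lines of Python. The sketch assumes a uniform linear array (so the phase delays of (2.6.29) grow linearly along the array) and equal main-beam weights of $1/L$; the function names are hypothetical.

```python
import cmath
import math


def phase_delays(num_elements, theta_deg, spacing=0.5):
    """alpha_l of (2.6.29) for an assumed uniform linear array (spacing in wavelengths)."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [step * n for n in range(num_elements)]


def main_beam_response(look_deg, arrival_deg, num_elements):
    """Return the equal-weight main-beam response and the presteered steering vector."""
    phi0 = [cmath.exp(1j * a) for a in phase_delays(num_elements, look_deg)]   # diagonal of (2.6.32)
    s = [cmath.exp(1j * a) for a in phase_delays(num_elements, arrival_deg)]
    presteered = [p.conjugate() * x for p, x in zip(phi0, s)]                  # Phi_0^H S
    return sum(presteered) / num_elements, presteered


look_response, presteered_look = main_beam_response(70.0, 70.0, 5)
off_response, _ = main_beam_response(70.0, 20.0, 5)
```

For a source in the look direction the presteered vector is a vector of ones and the response is unity; for other directions the magnitude of the response is smaller.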

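These signals are used to form the main beam as well as $M - 1$ interference beams. The 
main beam is formed using fixed weights on all channels. These weights are selected to 
be equal to $1/L$ so that a unity response is maintained in the look direction. Let these 
be denoted by an $L$-dimensional vector $\mathbf{V}$ given by 

$$\mathbf{V} = \frac{1}{L}\mathbf{1} \tag{2.6.33}$$

The $M - 1$ interference beams are formed using a blocking matrix $\mathbf{B}$. Let these be 
denoted by an $M - 1$ dimensional vector $\mathbf{q}(t)$, given by 

$$\begin{aligned}
\mathbf{q}(t) &= \mathbf{B}^H \mathbf{x}'(t) \\
&= \mathbf{B}^H \boldsymbol{\Phi}_0^H \mathbf{x}(t)
\end{aligned} \tag{2.6.34}$$

where matrix $\mathbf{B}$ has rank $M - 1$ and satisfies 

$$\mathbf{B}^H \mathbf{1} = \mathbf{0} \tag{2.6.35}$$

Expressions for the main beam, interference beams, and GSC output are then, 
respectively, given by 

$$\psi(t) = \mathbf{V}^H \boldsymbol{\Phi}_0^H \mathbf{x}(t) \tag{2.6.36}$$

$$\begin{aligned}
\eta(t) &= \mathbf{w}^H \mathbf{q}(t) \\
&= \mathbf{w}^H \mathbf{B}^H \boldsymbol{\Phi}_0^H \mathbf{x}(t)
\end{aligned} \tag{2.6.37}$$

and 

$$\begin{aligned}
y(t) &= \psi(t) - \eta(t) \\
&= \mathbf{V}^H \boldsymbol{\Phi}_0^H \mathbf{x}(t) - \mathbf{w}^H \mathbf{B}^H \boldsymbol{\Phi}_0^H \mathbf{x}(t)
\end{aligned} \tag{2.6.38}$$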


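It can easily be verified that an expression for the mean output power of the GSC for a 
given $\mathbf{w}$ is given by 

$$\begin{aligned}
P(\mathbf{w}) &= E\left[y(t)y^*(t)\right] \\
&= E\left[\{\mathbf{V}^H\boldsymbol{\Phi}_0^H\mathbf{x}(t) - \mathbf{w}^H\mathbf{B}^H\boldsymbol{\Phi}_0^H\mathbf{x}(t)\}\{\mathbf{V}^H\boldsymbol{\Phi}_0^H\mathbf{x}(t) - \mathbf{w}^H\mathbf{B}^H\boldsymbol{\Phi}_0^H\mathbf{x}(t)\}^*\right] \\
&= \mathbf{V}^H\boldsymbol{\Phi}_0^H\mathbf{R}\boldsymbol{\Phi}_0\mathbf{V} + \mathbf{w}^H\mathbf{B}^H\boldsymbol{\Phi}_0^H\mathbf{R}\boldsymbol{\Phi}_0\mathbf{B}\mathbf{w} - \mathbf{V}^H\boldsymbol{\Phi}_0^H\mathbf{R}\boldsymbol{\Phi}_0\mathbf{B}\mathbf{w} - \mathbf{w}^H\mathbf{B}^H\boldsymbol{\Phi}_0^H\mathbf{R}\boldsymbol{\Phi}_0\mathbf{V} \\
&= \mathbf{V}^H\tilde{\mathbf{R}}\mathbf{V} + \mathbf{w}^H\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B}\mathbf{w} - \mathbf{V}^H\tilde{\mathbf{R}}\mathbf{B}\mathbf{w} - \mathbf{w}^H\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{V}
\end{aligned} \tag{2.6.39}$$

where 

$$\tilde{\mathbf{R}} = \boldsymbol{\Phi}_0^H \mathbf{R} \boldsymbol{\Phi}_0 \tag{2.6.40}$$

is the array correlation matrix after steering delays. 

Comparing (2.6.15) and (2.6.39), one notes that the expression for the mean output power 
of the GSC for a given $\mathbf{w}$ is analogous to that given by (2.6.15), with $\mathbf{V}$ 
and the correlation matrix, respectively, given by (2.6.33) and (2.6.40) and $\mathbf{B}$ 
satisfying (2.6.35). Thus, the expression for GSC optimal weights is analogous to (2.6.18), 
with $\mathbf{R}$ replaced by $\tilde{\mathbf{R}}$, that is, 

$$\hat{\mathbf{w}} = (\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B})^{-1}\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{V} \tag{2.6.41}$$

FIGURE 2.11 

Postbeamformer interference canceler structure. 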


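The expression for the mean output noise power of the optimal GSC then becomes 

$$\begin{aligned}
P_N(\hat{\mathbf{w}}) ={}& \mathbf{V}^H\tilde{\mathbf{R}}_N\mathbf{V} + \mathbf{V}^H\tilde{\mathbf{R}}\mathbf{B}(\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B})^{-1}\mathbf{B}^H\tilde{\mathbf{R}}_N\mathbf{B}(\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B})^{-1}\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{V} \\
&- \mathbf{V}^H\tilde{\mathbf{R}}\mathbf{B}(\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B})^{-1}\mathbf{B}^H\tilde{\mathbf{R}}_N\mathbf{V} - \mathbf{V}^H\tilde{\mathbf{R}}_N\mathbf{B}(\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{B})^{-1}\mathbf{B}^H\tilde{\mathbf{R}}\mathbf{V}
\end{aligned} \tag{2.6.42}$$

with $\tilde{\mathbf{R}}_N$ denoting the noise-only array correlation matrix after steering 
delays, and the output SNR is given by (2.6.28). 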


2.6.3 Postbeamformer Interference Canceler 

In this section, a processor with two beams referred to as the postbeamformer interference 
canceler (PIC) in previous studies [God86a, God89, God89a, God91] is examined in the 
presence of a look-direction signal of power $p_S$, an interference of power $p_I$, and 
uncorrelated noise of power $\sigma_n^2$. 

As discussed previously for the general beam space processor, the two-beam processor 
processes the signals derived from an antenna array by forming two beams using fixed 
beamforming weights, as shown in Figure 2.11. One beam, referred to as the signal beam, 
is formed to have a fixed response in the look direction. The processed output of the 
second beam, referred to as the interference beam, is subtracted from the output of the 
signal beam to form the output of the PIC. 

Let $L$-dimensional complex vectors $\mathbf{V}$ and $\mathbf{U}$ represent the fixed weights 
of the signal beamformer and the interference beamformer, respectively. It follows from 
Figure 2.11 that the output $\psi(t)$ of the signal beam and the output $\eta(t)$ of the 
interference beam are, respectively, given by 

$$\psi(t) = \mathbf{V}^H \mathbf{x}(t) \tag{2.6.43}$$

and 

$$\eta(t) = \mathbf{U}^H \mathbf{x}(t) \tag{2.6.44}$$

The output $y(t)$ of the PIC processor is formed by subtracting the weighted output of the 
interference beam from the output of the signal beam, that is, 

$$y(t) = \psi(t) - w\,\eta(t) \tag{2.6.45}$$

For a given weight $w$, the mean output power $P(w)$ of the PIC processor is given by 

$$P(w) = \mathbf{V}^H\mathbf{R}\mathbf{V} + w^* w\,\mathbf{U}^H\mathbf{R}\mathbf{U} - w^*\,\mathbf{V}^H\mathbf{R}\mathbf{U} - w\,\mathbf{U}^H\mathbf{R}\mathbf{V} \tag{2.6.46}$$


2.6.3.1 Optimal PIC 

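Let $\hat{w}$ represent the complex weight of the interference channel of the PIC that 
minimizes the mean output power of the PIC for given beamformer weights $\mathbf{V}$ and 
$\mathbf{U}$. This weight $\hat{w}$ is referred to as the optimal weight, and the PIC with this 
weight is referred to as the optimal PIC. 

From the definition of the optimal weight, it follows that 

$$\left.\frac{\partial P(w)}{\partial w}\right|_{w = \hat{w}} = 0 \tag{2.6.47}$$

which, along with (2.6.46), implies that 

$$\hat{w} = \frac{\mathbf{V}^H\mathbf{R}\mathbf{U}}{\mathbf{U}^H\mathbf{R}\mathbf{U}} \tag{2.6.48}$$

The mean output power of the optimal PIC is given by 

$$P(\hat{w}) = \mathbf{V}^H\mathbf{R}\mathbf{V} - \frac{\mathbf{U}^H\mathbf{R}\mathbf{V}\,\mathbf{V}^H\mathbf{R}\mathbf{U}}{\mathbf{U}^H\mathbf{R}\mathbf{U}} \tag{2.6.49}$$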
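
Because the PIC has a single adjustable weight, (2.6.46), (2.6.48), and (2.6.49) are easy to check numerically. The sketch below does so for one assumed scenario; the array geometry, source powers, the particular choice of $\mathbf{U}$, and the function names are illustrative assumptions.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(sources, sigma2):
    """R = sum_k p_k S_k S_k^H + sigma2 * I for (power, steering vector) pairs."""
    n = len(sources[0][1])
    return [[sum(p * s[i] * s[j].conjugate() for p, s in sources)
             + (sigma2 if i == j else 0.0) for j in range(n)] for i in range(n)]


def mat_vec(mat, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def pic_power(w, V, U, R):
    """(2.6.46): mean output power of the PIC for a complex weight w."""
    RV, RU = mat_vec(R, V), mat_vec(R, U)
    return (inner(V, RV) + w.conjugate() * w * inner(U, RU)
            - w.conjugate() * inner(V, RU) - w * inner(U, RV)).real


def optimal_pic_weight(V, U, R):
    """(2.6.48): w_hat = V^H R U / U^H R U."""
    RU = mat_vec(R, U)
    return inner(V, RU) / inner(U, RU)


L = 6
s0 = steering_vector(L, 90.0)
s_i = steering_vector(L, 50.0)
R = correlation_matrix([(1.0, s0), (4.0, s_i)], sigma2=0.2)
V = [s / L for s in s0]     # signal beam
U = [s / L for s in s_i]    # interference beam (one possible choice)

w_hat = optimal_pic_weight(V, U, R)
RV, RU = mat_vec(R, V), mat_vec(R, U)
p_min = (inner(V, RV) - inner(U, RV) * inner(V, RU) / inner(U, RU)).real   # (2.6.49)
```

Evaluating `pic_power` at `w_hat` reproduces `p_min`, and moving the weight in any direction in the complex plane increases the output power.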

In the following discussion, three different beamformer weights for the interference 
beam are considered. For these cases, the expressions for the signal power, residual 
interference power, and uncorrelated noise power at the output of the optimal PIC are derived 
in [God89a]. For the three cases considered, it is assumed that the signal beam is formed 
using the conventional beamforming weights, that is, 

$$\mathbf{V} = \frac{\mathbf{S}_0}{L} \tag{2.6.50}$$

This choice of beamformer weights for the signal beam ensures that the response of the 
signal beam in the signal direction is unity. 

2.6.3.2 PIC with Conventional Interference Beamformer 

Let the interference beam be formed with the beamforming weights 

$$\mathbf{U} = \frac{\mathbf{S}_I}{L} \tag{2.6.51}$$

This choice of beamforming weights ensures that the response of the beam in the 
interference direction is unity. Note that these weights are not constrained to block the look 
direction signal passing through to the interference beam as was done using blocking 
matrix $\mathbf{B}$ in the previous discussion. This particular interference beam highlights 
the effect of the signal present in the auxiliary beams. 

It follows from (2.6.50) and (2.6.51) that the response of the interference beam in the 
signal direction is the same as that of the signal beam in the interference direction. This 
implies that a large amount of the signal power leaks into the interference beam. This 
leads to a substantial amount of signal suppression and the presence of residual 
interference when the PIC is optimized. This aspect of the PIC is now considered and expressions 
for the mean output signal power and the mean output noise power of the optimal PIC 
are presented. 

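Substituting for $\mathbf{U}$ and $\mathbf{V}$ in (2.6.48), you obtain an expression for the 
weight $\hat{w}_c$ of the optimal PIC using the conventional interference beamformer (CIB): 

$$\hat{w}_c = \frac{\mathbf{S}_0^H \mathbf{R} \mathbf{S}_I}{\mathbf{S}_I^H \mathbf{R} \mathbf{S}_I} \tag{2.6.52}$$

Substituting for $\mathbf{R}$, this leads to 

$$\hat{w}_c = \beta\,\frac{1 + \dfrac{p_I}{p_S} + \dfrac{\sigma_n^2}{L p_S}}{1 - \rho + \dfrac{p_I}{p_S} + \dfrac{\sigma_n^2}{L p_S}} \tag{2.6.53}$$

where $\beta$ is a normalized dot product of $\mathbf{S}_0$ and $\mathbf{S}_I$, defined as 
$\beta = \mathbf{S}_0^H \mathbf{S}_I / L$, so that $\rho = 1 - \beta^*\beta$ by (2.4.44). 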

Substituting $\mathbf{R}_S$, $\mathbf{R}_I$, $\mathbf{R}_n$, and $\mathbf{R}_N$ for $\mathbf{R}$ 
in (2.6.46) with $w = \hat{w}_c$, the following expressions are obtained for the output signal 
power, residual interference power, uncorrelated noise power, and output noise power, 
respectively: 

$$P_S(\hat{w}_c) = \frac{p_S \rho^2}{(1 + \alpha_I)^2} \tag{2.6.54}$$

$$P_I(\hat{w}_c) = \frac{p_I \rho^2 / (1 - \rho)}{\left[1 + \dfrac{p_I/p_S + \sigma_n^2/(L p_S)}{1 - \rho}\right]^2} \tag{2.6.55}$$

$$P_n(\hat{w}_c) = \frac{\sigma_n^2}{L}\left[\rho + \frac{\rho^2}{(1 - \rho)(1 + 1/\alpha_I)^2}\right] \tag{2.6.56}$$

and 

$$P_N(\hat{w}_c) = \frac{\sigma_n^2 \rho}{L} + \frac{\rho^2}{1 - \rho}\,\frac{p_I + \sigma_n^2/L}{(1 + 1/\alpha_I)^2} \tag{2.6.57}$$

where 

$$\alpha_I = \frac{(1 - \rho)\,p_S}{p_I + \sigma_n^2/L} \tag{2.6.58}$$


is the SNR at the output of the interference beam. Since the SNR is a positive quantity 
and the parameter $\rho$ is not more than unity, it follows from (2.6.54) that the signal power 
at the output of the optimal PIC using the CIB is less than the signal power at the output 
of the signal beam. Hence, the signal has been suppressed by the PIC. Furthermore, the 
signal suppression increases as (1) the parameter $\rho$, which depends on the array geometry 
and the relative directions of the two sources, decreases, and (2) the SNR at the output of 
the interference beam increases. 

Since the SNR at the output of the interference beam is proportional to the input signal 
power, it follows that signal suppression increases as the input signal power increases. 
On the other hand, an increase in the interference power as well as the uncorrelated noise 
power at the input of the PIC decreases the SNR at the output of the interference beam 
and, hence, decreases the signal suppression of the optimal PIC using the CIB. 
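
The signal suppression described above can be reproduced numerically. The sketch below computes the output signal power of the optimal PIC using the CIB in two ways: directly, by evaluating (2.6.46) with the signal-only correlation matrix at the weight from (2.6.48), and from the closed form (2.6.54) with (2.6.58). The four-element array, the source directions, and the function names are assumptions chosen so that $\rho$ is well below unity.

```python
import cmath
import math


def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Assumed uniform linear array; theta_deg is measured from the array axis."""
    step = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * step * n) for n in range(num_elements)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(sources, sigma2):
    """R = sum_k p_k S_k S_k^H + sigma2 * I for (power, steering vector) pairs."""
    n = len(sources[0][1])
    return [[sum(p * s[i] * s[j].conjugate() for p, s in sources)
             + (sigma2 if i == j else 0.0) for j in range(n)] for i in range(n)]


def mat_vec(mat, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def pic_power(w, V, U, R):
    """(2.6.46) evaluated for a given correlation matrix R."""
    RV, RU = mat_vec(R, V), mat_vec(R, U)
    return (inner(V, RV) + w.conjugate() * w * inner(U, RU)
            - w.conjugate() * inner(V, RU) - w * inner(U, RV)).real


def cib_signal_power(L, theta0, theta_i, p_s, p_i, sigma2):
    """Output signal power of the optimal PIC with CIB: (direct, closed form)."""
    s0, s_i = steering_vector(L, theta0), steering_vector(L, theta_i)
    V = [s / L for s in s0]
    U = [s / L for s in s_i]
    R = correlation_matrix([(p_s, s0), (p_i, s_i)], sigma2)
    RU = mat_vec(R, U)
    w_c = inner(V, RU) / inner(U, RU)                      # (2.6.48) with CIB weights
    R_s = correlation_matrix([(p_s, s0)], 0.0)             # signal-only matrix
    direct = pic_power(w_c, V, U, R_s)
    rho = 1.0 - abs(inner(s0, s_i)) ** 2 / L ** 2
    alpha_i = (1.0 - rho) * p_s / (p_i + sigma2 / L)       # (2.6.58)
    closed = p_s * rho ** 2 / (1.0 + alpha_i) ** 2         # (2.6.54)
    return direct, closed


# Fraction of the input signal power that survives, for increasing signal power.
surviving = [cib_signal_power(4, 90.0, 75.0, p, 1.0, 0.1)[1] / p for p in (0.1, 1.0, 10.0, 100.0)]
```

The entries of `surviving` decrease as the input signal power grows, which is the behavior stated in the text.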

Physically, the signal suppression by the optimal PIC using the CIB arises from the 
leakage of the signal into the interference beam. The component of the signal in the 
interference beam is subtracted from the signal in the signal beam; in the process of 
minimization of total output power, this leads to signal suppression. Signal suppression 
increases as the parameter $\rho$ decreases. The reason for this is that as $\rho$ decreases, the 
response of the interference beam in the signal direction increases, which increases the 
signal leakage into the interference beam, causing more signal suppression. 

To understand the dependency of the signal suppression on $\alpha_I$, the SNR at the output 
of the interference beam, rewrite (2.6.53) as 

$$\hat{w}_c = \frac{\beta}{1 - \rho}\left(\frac{1 + \dfrac{1 - \rho}{\alpha_I}}{1 + \dfrac{1}{\alpha_I}}\right) \tag{2.6.59}$$

It follows from (2.6.59) that as $\alpha_I$ increases, the magnitude of $\hat{w}_c$ increases, 
resulting in an increase of the signal suppression. In the limit, as $\alpha_I \to \infty$, 
$\hat{w}_c \to \beta/(1 - \rho)$. It can easily be verified that for this value of $\hat{w}_c$, 
the output signal power reduces to zero, resulting in total signal suppression. 

The behavior of the output noise power of the optimal PIC using the CIB is described 
by (2.6.57). The first term, which is proportional to the uncorrelated noise power at the 
input of the PIC, decreases as the number of elements in the array increases and the 
parameter $\rho$ decreases. The second term, which is proportional to the total noise power 
at the output of the interference beam, also decreases as the parameter $\rho$ decreases and 
depends on $\alpha_I$. As $\alpha_I$ increases, resulting in an increase of $\hat{w}_c$, the 
second term on the right side of (2.6.57) increases. This implies that the output noise power 
of the optimal PIC using the CIB increases as the input signal power increases. 

FIGURE 2.12 

Output SNR of the PIC using CIB vs. input SNR for a ten-element linear array, $\theta_0 = 90^\circ$, $p_I = 1$, $\theta_I = 30^\circ$. (From 
Godara, L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 


Let $\mathrm{SNR}(\hat{w}_c)$ denote the output SNR of the optimal PIC using the CIB. Then, it 
follows from (2.6.54) and (2.6.57) that 

$$\mathrm{SNR}(\hat{w}_c) = \frac{\rho(1 - \rho)\,p_S}{(1 - \rho)(1 + \alpha_I)^2(\sigma_n^2/L) + \rho\,\alpha_I^2\,(p_I + \sigma_n^2/L)} \tag{2.6.60}$$

For the special case when the noise environment consists of only directional sources, 
that is, when $\sigma_n^2 = 0$, (2.6.60) reduces to 

$$\mathrm{SNR}(\hat{w}_c) = \frac{1}{\alpha_I} \tag{2.6.61}$$

which agrees with the results presented in [Wid75, Wid82] that in the absence of 
uncorrelated noise, the output SNR of an interference canceler is inversely proportional to the 
input SNR. In the presence of uncorrelated noise power, the behavior of 
$\mathrm{SNR}(\hat{w}_c)$ is shown in Figure 2.12. 

The results in Figure 2.12 are for an equally spaced linear array of ten elements, with 
inter-element spacing of one-half wavelength. The signal source is assumed to be broadside 
to the array, and an interference source of unity power is assumed 60° off broadside. 
For this array configuration and source scenario, the parameter $\rho$ is equal to 0.99. Figure 
2.12 shows that the presence of uncorrelated noise changes the behavior of $\mathrm{SNR}(\hat{w}_c)$ 
dramatically, particularly for low input SNR. In the absence of uncorrelated noise, the PIC 
using the CIB is able to cancel most of the interference when the input SNR is small, 
resulting in high output SNR. The presence of uncorrelated noise increases the total output 
noise significantly (see Equation 2.6.57), resulting in a substantial drop in the output SNR. 
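
The behavior described above can be checked numerically. The following is a minimal sketch, not the author's code: the function names are illustrative assumptions, the formula is (2.6.60), and $\alpha_I$ is taken to be the SNR at the output of the interference beam as defined in (2.6.90). The scenario values (L = 10, $\rho$ = 0.99, $p_I$ = 1) are those quoted for Figure 2.12.

```python
def alpha_i(p_s, p_i, sigma2, L, rho):
    """SNR at the output of the interference beam, as in (2.6.90)."""
    return (1.0 - rho) * p_s / (p_i + sigma2 / L)


def snr_cib(p_s, p_i, sigma2, L, rho):
    """Output SNR of the optimal PIC using the CIB, following (2.6.60)."""
    a = alpha_i(p_s, p_i, sigma2, L, rho)
    num = rho * (1.0 - rho) * p_s
    den = ((1.0 - rho) * (1.0 + a) ** 2 * (sigma2 / L)
           + rho * a ** 2 * (p_i + sigma2 / L))
    return num / den


if __name__ == "__main__":
    # Scenario quoted for Figure 2.12: ten elements, rho = 0.99, unit interference.
    L, rho, p_i = 10, 0.99, 1.0
    for p_s in (1e-3, 1e-2, 1e-1, 1.0):
        no_noise = snr_cib(p_s, p_i, 0.0, L, rho)      # equals 1/alpha_I, (2.6.61)
        with_noise = snr_cib(p_s, p_i, 0.01, L, rho)   # uncorrelated noise present
        print(f"p_S={p_s:g}  sigma2=0: {no_noise:.4g}  sigma2=0.01: {with_noise:.4g}")
```

With $\sigma_n^2 = 0$ the function returns $1/\alpha_I$, and adding uncorrelated noise lowers the output SNR most strongly at low input signal power, in line with the discussion of Figure 2.12.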
2.6.3.3 PIC with Orthogonal Interference Beamformer 

Let the interference beam be formed using the beamforming weights 

$$U = U_o \tag{2.6.62}$$

where $U_o$ is a complex vector such that 

$$U_o^H S_0 = 0 \tag{2.6.63}$$

The constraint specified by (2.6.63) ensures that the interference beam has a null in the 
signal direction. Thus, the interference beam does not contain any signal and the PIC using 
the orthogonal interference beamformer (OIB) does not suppress the signal. Note that the 
vector $U_o$ may be a steering vector. This case corresponds to the parameter $\rho$ taking on a 
value of unity. 

Various expressions for the optimal PIC using the OIB are now presented. It is assumed 
that the interference beam of the PIC using the OIB does not have a null in the interference 
direction. If the interference beam had a null in the interference direction, it would 
contain no interference, and subtracting its weighted output from the signal beam to form 
the PIC output would not reduce the interference present in the signal beam. 

From (2.6.48), (2.6.50) and (2.6.62), it follows that the optimal weight $\hat{w}_o$ of the PIC using 
the OIB is given by 

$$\hat{w}_o = \frac{S_0^H R\,U_o}{L\,U_o^H R\,U_o} \tag{2.6.64}$$


Substituting for R in (2.6.64), one obtains, after manipulation, 

$$\hat{w}_o = \frac{S_0^H S_I S_I^H U_o}{L^2 \beta_o \left(\gamma_o + \sigma_n^2/(Lp_I)\right)} \tag{2.6.65}$$

where 

$$\beta_o = U_o^H U_o \tag{2.6.66}$$

and 

$$\gamma_o = \frac{U_o^H S_I S_I^H U_o}{L\,U_o^H U_o} \tag{2.6.67}$$

Note that $\gamma_o$, as defined by (2.6.67), is a positive real scalar, with 

$$0 < \gamma_o \le 1 \tag{2.6.68}$$

and represents the normalized power response of the interference beam in the direction 
of the interference. 
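
To make the parameters concrete, the sketch below evaluates $\rho$ and $\gamma_o$ for the ten-element, half-wavelength-spaced array used in the examples of this section, with the interference beam formed by the endfire steering vector. It is an illustration under stated assumptions: the steering-vector model (phase $2\pi d \cos\theta$ per element, with $\theta$ measured from the line of the array) and the helper names are not taken from the text, and $\rho$ is assumed to be $1 - |S_0^H S_I|^2/L^2$.

```python
import cmath
import math


def steering(theta_deg, L=10, spacing=0.5):
    """Steering vector of a uniform linear array (assumed model).

    theta_deg is measured from the line of the array; spacing is in wavelengths.
    """
    psi = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * psi * k) for k in range(L)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def rho(s0, si):
    """rho = 1 - |S_0^H S_I|^2 / L^2 (assumed definition)."""
    L = len(s0)
    return 1.0 - abs(inner(s0, si)) ** 2 / L ** 2


def gamma_o(u, si):
    """gamma_o = U^H S_I S_I^H U / (L U^H U), as in (2.6.67)."""
    L = len(si)
    return abs(inner(u, si)) ** 2 / (L * inner(u, u).real)


if __name__ == "__main__":
    s0 = steering(90.0)   # signal broadside to the array
    u_o = steering(0.0)   # endfire steering vector used as U_o
    print("|U_o^H S_0| =", abs(inner(u_o, s0)))   # numerically zero, so (2.6.63) holds
    for theta_i in (30.0, 85.0):                  # interference 60 and 5 degrees off broadside
        si = steering(theta_i)
        print(theta_i, round(rho(s0, si), 2), round(gamma_o(u_o, si), 2))
```

For interference at 30° and 85° from the line of the array, the computed values round to the figures quoted in the text ($\rho$ of 0.99 and 0.48, $\gamma_o$ of 0.17 and 0.01).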

The expressions for the signal power, the residual interference power, the uncorrelated 
noise power, and total noise power at the output of the optimal PIC using the OIB are, 
respectively, obtained by substituting $R_S$, $R_I$, $R_n$, and $R_N$ for $R$, and $w = \hat{w}_o$, in (2.6.46). 
These are given by 

$$P_S(\hat{w}_o) = p_S \tag{2.6.69}$$

$$P_I(\hat{w}_o) = \frac{p_I(1-\rho)}{\left[1 + \gamma_o\,(Lp_I/\sigma_n^2)\right]^2} \tag{2.6.70}$$

$$P_n(\hat{w}_o) = \frac{\sigma_n^2}{L} + \frac{\sigma_n^2}{L}\left[\frac{(1-\rho)\gamma_o}{\left(\gamma_o + \sigma_n^2/(Lp_I)\right)^2}\right] \tag{2.6.71}$$

and 

$$P_N(\hat{w}_o) = \frac{\sigma_n^2}{L}\left[1 + \frac{1-\rho}{\gamma_o + \sigma_n^2/(Lp_I)}\right] \tag{2.6.72}$$


From expressions (2.6.69) to (2.6.72), the following observations can be made: 

1. The optimal PIC using the OIB does not suppress the signal. This is because there 
is no leakage of the signal into the interference beam. 

2. The residual interference power of the optimal PIC using the OIB depends on 
$p_I/\sigma_n^2$. For a given array geometry and noise environment, the normalized residual 
interference power $P_I(\hat{w}_o)/p_I$ decreases as $p_I/\sigma_n^2$ increases. In a noise environment 
with a very high $p_I/\sigma_n^2$, the residual interference power of the optimal PIC using 
the OIB becomes very small. In the limit, as $p_I/\sigma_n^2 \to \infty$, 

$$\hat{w}_o \to \frac{S_0^H S_I S_I^H U_o}{L^2 \beta_o \gamma_o} \tag{2.6.73}$$

which leads to full cancelation of the interference (see Equation 2.6.70). On the 
other hand, as $p_I/\sigma_n^2 \to 0$, 

$$\hat{w}_o \to 0 \tag{2.6.74}$$

and no cancelation of the interference takes place. 

3. The uncorrelated noise power at the output of the PIC is more than the uncorrelated 
noise power at the output of the signal beam. This follows from (2.6.71). The RHS 
of (2.6.71) consists of two terms. The first term is the same as the uncorrelated noise 
power at the output of the signal beam and the second term is proportional to the 
uncorrelated noise power at the output of the signal beam; the proportionality 
constant in the square brackets depends on $p_I/\sigma_n^2$. As $p_I/\sigma_n^2$ increases, the 
quantity in the square brackets increases. This is due to the fact that $\hat{w}_o$ increases 
as $p_I/\sigma_n^2$ increases. In the limit, the maximum increase in the uncorrelated noise 
power caused by the optimal PIC using the OIB is $(\sigma_n^2/L)(1-\rho)/\gamma_o$. 

4. The total noise power $P_N(\hat{w}_o)$ at the output of the optimal PIC using the OIB does 
not depend on the signal power. It is proportional to the uncorrelated noise power 
at the output of the signal beam and decreases as $p_I/\sigma_n^2$ decreases. The uncorrelated 
noise dominates the total noise at the output of the optimal PIC. 
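
The four observations can be explored with a small numerical sketch of (2.6.69) to (2.6.72). The function name and the sample values (taken from the Figure 2.12 scenario, with $\gamma_o$ = 0.17 for an endfire interference beam) are illustrative assumptions rather than part of the original analysis.

```python
def oib_powers(p_s, p_i, sigma2, L, rho, gamma_o):
    """Output powers of the optimal PIC using the OIB, following (2.6.69)-(2.6.72).

    Returns (signal, residual interference, uncorrelated noise, total noise).
    """
    eps = sigma2 / (L * p_i)                      # sigma_n^2 / (L p_I)
    signal = p_s                                                          # (2.6.69)
    interference = p_i * (1.0 - rho) / (1.0 + gamma_o / eps) ** 2         # (2.6.70)
    uncorrelated = (sigma2 / L) * (
        1.0 + (1.0 - rho) * gamma_o / (gamma_o + eps) ** 2)               # (2.6.71)
    total = (sigma2 / L) * (1.0 + (1.0 - rho) / (gamma_o + eps))          # (2.6.72)
    return signal, interference, uncorrelated, total


if __name__ == "__main__":
    L, rho, gamma_o, sigma2 = 10, 0.99, 0.17, 0.01
    for p_i in (1e-3, 1e-1, 1.0, 1e3):
        s, i, n, tot = oib_powers(1.0, p_i, sigma2, L, rho, gamma_o)
        print(f"p_I={p_i:g}: P_I/p_I={i / p_i:.3e}  P_n={n:.3e}  P_N={tot:.3e}")
```

The residual interference and uncorrelated noise returned by the function add up to the total noise, and the excess uncorrelated noise approaches $(\sigma_n^2/L)(1-\rho)/\gamma_o$ as $p_I/\sigma_n^2$ grows.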
FIGURE 2.13 

Output SNR of the PIC using OIB vs. input SNR for a ten-element linear array, $\theta_0 = 90^\circ$, $p_I = 1$, $\theta_I = 30^\circ$. (From 
Godara, L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 

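Now the output SNR of the optimal PIC using the OIB is examined. Let this ratio be 
denoted by $\mathrm{SNR}(\hat{w}_o)$. It follows from (2.6.69) and (2.6.72) that 

$$\mathrm{SNR}(\hat{w}_o) = \frac{Lp_S}{\sigma_n^2}\,\frac{\gamma_o + \sigma_n^2/(Lp_I)}{1 - \rho + \gamma_o + \sigma_n^2/(Lp_I)} \tag{2.6.75}$$

Thus, the output SNR of the optimal PIC using the OIB is proportional to the number of 
elements and $p_S/\sigma_n^2$, and depends on $p_I/\sigma_n^2$. As $p_I/\sigma_n^2 \to \infty$, 

$$\mathrm{SNR}(\hat{w}_o) \to \frac{Lp_S}{\sigma_n^2}\,\frac{\gamma_o}{1 + \gamma_o - \rho} \tag{2.6.76}$$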


Figure 2.13 shows $\mathrm{SNR}(\hat{w}_o)$ vs. input SNR for various $p_I/\sigma_n^2$. The array geometry and 
noise environment used for this example are the same as those used for Figure 2.12. The 
interference beam is formed using the steering vector in the endfire direction. The parameter 
$\gamma_o$ for this case is 0.17. From Figure 2.13, for a given input SNR the output SNR 
increases as $p_I/\sigma_n^2$ increases. 
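
A short sketch of the output SNR of the PIC using the OIB follows. It assumes the closed form obtained by dividing (2.6.69) by (2.6.72), together with the limit (2.6.76); the function names and sample values are illustrative only. The interference power is held at unity, as in the caption of Figure 2.13, so that $p_I/\sigma_n^2$ is varied through $\sigma_n^2$.

```python
def snr_oib(p_s, p_i, sigma2, L, rho, gamma_o):
    """Output SNR of the optimal PIC using the OIB: (2.6.69) divided by (2.6.72)."""
    eps = sigma2 / (L * p_i)
    return (L * p_s / sigma2) * (gamma_o + eps) / (1.0 - rho + gamma_o + eps)


def snr_oib_limit(p_s, sigma2, L, rho, gamma_o):
    """Limiting output SNR as p_I / sigma_n^2 grows without bound, as in (2.6.76)."""
    return (L * p_s / sigma2) * gamma_o / (1.0 + gamma_o - rho)


if __name__ == "__main__":
    L, rho, gamma_o, p_i, p_s = 10, 0.99, 0.17, 1.0, 0.1
    for sigma2 in (1.0, 0.1, 0.01, 0.001):
        print(f"p_I/sigma2={p_i / sigma2:g}: SNR={snr_oib(p_s, p_i, sigma2, L, rho, gamma_o):.4g}")
```

With $p_I$ fixed, lowering $\sigma_n^2$ raises the output SNR, which is the trend read from Figure 2.13.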


2.6.3.4 PIC with Improved Interference Beamformer 

As discussed in previous sections, the output of the optimal PIC contains residual interference 
power and uncorrelated noise power. This section presents and analyzes the 
optimal PIC using an interference beamformer that eliminates all interference in the output 
while simultaneously reducing the contribution of uncorrelated noise in the output. For 
this case, let the interference beam be formed with the beamforming weights 

$$U = \frac{R^{-1} S_I}{S_I^H R^{-1} S_I} \tag{2.6.77}$$

Note that the above expression is similar to the expression for the weights of constrained 
optimal beamformer except that in this case the beam is constrained in the direction of 
the interference rather than the look direction. Thus, it can easily be verified that the 
interference beam formed with these weights has unity response in the interference direction 
and has a reduced response in the signal direction. The response of the interference beam 
in the signal direction depends on the signal source power and uncorrelated noise power. 
It can be shown that this choice of beamforming weights minimizes the sum of signal 
power and uncorrelated noise power in the interference channel output. 

A substitution for V and U in (2.6.48) from (2.6.50) and (2.6.77), respectively, leads to 
the following expression for $\hat{w}_I$, the weight of the optimal PIC using the improved interference 
beamformer (IIB): 

$$\hat{w}_I = \frac{S_0^H S_I}{L} \tag{2.6.78}$$

It follows from (2.6.78) that the weight that minimizes the output power of the PIC using 
the IIB is independent of the signal, the interference, and the uncorrelated noise powers. 
This weight depends only on the array geometry and relative directions of the two sources. 
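
The substitution leading to (2.6.78) can be written out in a few lines. The sketch below assumes, consistently with (2.6.64), that the signal beam weights are $V = S_0/L$ and that the optimal weight has the form $\hat{w} = V^H R U / (U^H R U)$; it also uses the fact that $R$ is Hermitian, so that $S_I^H R^{-1} S_I$ is real.

```latex
\begin{align*}
V^H R\,U &= \frac{S_0^H}{L}\,R\,\frac{R^{-1} S_I}{S_I^H R^{-1} S_I}
          = \frac{S_0^H S_I}{L\,S_I^H R^{-1} S_I}, \\
U^H R\,U &= \frac{S_I^H R^{-1}\,R\,R^{-1} S_I}{\left(S_I^H R^{-1} S_I\right)^2}
          = \frac{1}{S_I^H R^{-1} S_I}, \\
\hat{w}_I &= \frac{V^H R\,U}{U^H R\,U} = \frac{S_0^H S_I}{L}.
\end{align*}
```

The factor $S_I^H R^{-1} S_I$, which carries all of the dependence on the source and noise powers, cancels between the numerator and the denominator.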

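The expressions for the signal power and the noise power at the output of the optimal 
PIC using the IIB are, respectively, given by 

$$P_S(\hat{w}_I) = p_S\,\rho^2 \left(\frac{1 + \sigma_n^2/(Lp_S)}{\rho + \sigma_n^2/(Lp_S)}\right)^2 \tag{2.6.79}$$

and 

$$P_N(\hat{w}_I) = \frac{\sigma_n^2}{L}\left\{\rho \left(\frac{1 + \sigma_n^2/(Lp_S)}{\rho + \sigma_n^2/(Lp_S)}\right)^2\right\} \tag{2.6.80}$$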


One observes from expressions (2.6.79) and (2.6.80) that the output signal power and the 
output noise power of the optimal PIC using the IIB are independent of the interference 
power. Thus, the optimal PIC using the IIB has completely suppressed the interference. 
Furthermore, the output signal power and output noise power depend on $\sigma_n^2/(Lp_S)$ (the ratio 
of uncorrelated noise power to signal power at the signal beam output). The output signal 
power increases as $\sigma_n^2/(Lp_S)$ decreases, and approaches the input signal power in the limit. 
Thus, in the presence of a strong signal source, the signal suppression by the optimal PIC 
using the IIB is negligible. The signal suppression is further reduced as the number 
of elements in the array is increased. 

The total noise power at the output of the optimal PIC using the IIB is equal to the 
uncorrelated noise power at the output of the signal beam when $\rho = 1$. To investigate the 
effect of $\sigma_n^2/(Lp_S)$ on the output noise power when $\rho < 1$, one can rewrite the quantity in 
the braces on the right side of (2.6.80) in the following form: 

$$\rho \left(\frac{1 + \sigma_n^2/(Lp_S)}{\rho + \sigma_n^2/(Lp_S)}\right)^2 = 1 + (1-\rho)\,\frac{\rho - \left(\sigma_n^2/(Lp_S)\right)^2}{\left(\rho + \sigma_n^2/(Lp_S)\right)^2} \tag{2.6.81}$$
TABLE 2.1 

Comparison of Normalized Signal Power, Interference Power, Uncorrelated Noise Power and SNR at 
the Output of the Optimal PIC Forming Interference Beam with CIB, OIB and IIB, 
$\gamma_o = \dfrac{U_o^H S_I S_I^H U_o}{L\,U_o^H U_o}$, $\rho = 1 - \dfrac{S_0^H S_I S_I^H S_0}{L^2}$, and $\alpha_I = \dfrac{(1-\rho)p_S}{p_I + \sigma_n^2/L}$ 

Since $\rho < 1$, it follows from (2.6.81) that the second term on the RHS is negative if 
$\rho < (\sigma_n^2/(Lp_S))^2$. Thus, under this condition the quantity in the braces on the right side of 
(2.6.80) is less than unity and, hence, the uncorrelated noise power at the output of the 
PIC is less than the uncorrelated noise power at the output of the signal beam. Thus, the 
optimal PIC using the IIB reduces the uncorrelated noise when $\rho < (\sigma_n^2/(Lp_S))^2$. On the other 
hand, when $\rho > (\sigma_n^2/(Lp_S))^2$, the quantity in the braces on the right side of (2.6.80) is more 
than unity and the optimal PIC using the IIB increases the uncorrelated noise power. Note 
that at the output of the optimal PIC using the IIB, total noise consists of uncorrelated 
noise only: it increases as $\sigma_n^2/(Lp_S)$ decreases and in the limit approaches $\sigma_n^2/(L\rho)$. 

Now the output SNR of the optimal PIC using the IIB is examined. Let this ratio be 
denoted by $\mathrm{SNR}(\hat{w}_I)$. It follows then from (2.6.79) and (2.6.80) that 

$$\mathrm{SNR}(\hat{w}_I) = \frac{\rho\,L\,p_S}{\sigma_n^2} \tag{2.6.82}$$

Thus, the output SNR of the optimal PIC using the IIB is proportional to the input signal 
to uncorrelated noise ratio, the number of elements in the array, and the parameter $\rho$. 
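
The IIB results can be checked with a few lines of arithmetic. The sketch below is illustrative (function names and sample values are assumptions): it evaluates (2.6.79) and (2.6.80), confirms that the rewritten form (2.6.81) gives the same value for the quantity in braces, and shows the threshold $\rho = (\sigma_n^2/(Lp_S))^2$ at which the PIC switches from reducing to increasing the uncorrelated noise.

```python
def iib_outputs(p_s, sigma2, L, rho):
    """Signal power (2.6.79), noise power (2.6.80) and the braces term of (2.6.80)."""
    eps = sigma2 / (L * p_s)                       # sigma_n^2 / (L p_S)
    ratio_sq = ((1.0 + eps) / (rho + eps)) ** 2
    signal = p_s * rho ** 2 * ratio_sq
    braces = rho * ratio_sq
    noise = (sigma2 / L) * braces
    return signal, noise, braces


def braces_rewritten(p_s, sigma2, L, rho):
    """Right side of (2.6.81)."""
    eps = sigma2 / (L * p_s)
    return 1.0 + (1.0 - rho) * (rho - eps ** 2) / (rho + eps) ** 2


if __name__ == "__main__":
    L, sigma2, rho = 10, 0.01, 0.48
    for p_s in (1e-4, 1e-3, 1e-2, 1.0):
        signal, noise, braces = iib_outputs(p_s, sigma2, L, rho)
        eps = sigma2 / (L * p_s)
        print(f"p_S={p_s:g}: rho>eps^2? {rho > eps ** 2}  braces={braces:.4g}  "
              f"SNR={signal / noise:.4g}")
```

The ratio of the two outputs equals $\rho L p_S/\sigma_n^2$ for every input, as stated in (2.6.82).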

2.6.3.5 Discussion and Comments 

A comparison of the various results is presented in Table 2.1. The output signal power, 
residual interference power, and output uncorrelated noise power of the optimal PIC are, 
respectively, normalized by $p_S$, $p_I(1-\rho)$, and $\sigma_n^2/L$. These quantities correspond to the 
signal power, the interference power, and the uncorrelated noise power at the output of 
the signal beam. This particular form of normalization is chosen to facilitate the comparison 
between the performance of the PIC using the OIB, IIB, and CIB, and that of an 
element space processor using conventional weights (the signal beam is formed using 
conventional weights). 

It follows from Table 2.1 that the SNR of the optimal PIC for the three cases is the same 
when $\rho$ is equal to unity or, equivalently, when the steering vectors in the signal and 
interference directions are orthogonal to each other. The case of $\rho < 1$ is now considered. 
For this situation, the results of the optimal PIC with the three interference beamformers 
are discussed and some examples are presented. All examples presented here are for a 
linear array of ten equally spaced elements with one-half wavelength spacing. The signal 
direction is broadside to the array, and the uncorrelated noise power on each element is 
equal to 0.01. The interference beam for the OIB case is formed using the steering vector 
in the endfire direction. Thus, knowledge of the interference direction is not used in 
selecting $U_o$. 

2.6.3.5.1 Signal Suppression 

From Table 2.1, the following observations about the normalized output signal power of 
the optimal PIC for the three cases can be made: 


1. The optimal PIC using the OIB does not suppress the signal; in the other two 
cases the signal is suppressed. The signal suppression by the optimal PIC using 
the CIB is more than that by the PIC using the IIB. This follows from the following 
expression for the difference of the normalized output signal powers: 

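$$\frac{P_S(\hat{w}_c)}{p_S} - \frac{P_S(\hat{w}_I)}{p_S} = -\rho^2\,\frac{\left[\left(\rho + \sigma_n^2/(Lp_S)\right) + \left(1 + \sigma_n^2/(Lp_S)\right)(1+\alpha_I)\right]\left[(1-\rho) + \alpha_I\left(1 + \sigma_n^2/(Lp_S)\right)\right]}{(1+\alpha_I)^2\left(\rho + \sigma_n^2/(Lp_S)\right)^2} \tag{2.6.83}$$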

Physically, the interference beam rejects more of the signal in the IIB than in the 
CIB and rejects all of the signal in the OIB. This leads to no suppression of signal 
by the PIC using the OIB and less suppression in the case of the IIB than that of 
the CIB. 

2. The normalized output signal power of the optimal PIC using the IIB is independent 
of the interference power. In the case of the optimal PIC using the CIB, it 
increases as the interference power increases. Thus, it follows that the difference 
between the normalized output signal power for the two cases decreases as the 
interference power increases. In the limit the difference approaches 

$$-\rho^2(1-\rho)\,\frac{1 + \rho + 2\sigma_n^2/(Lp_S)}{\left(\rho + \sigma_n^2/(Lp_S)\right)^2}$$


3. The normalized output signal power depends on the input signal power for both 
the CIB and IIB cases. In the case of the optimal PIC using the CIB, it decreases 
as the input signal power increases. Thus, the signal suppression increases as the 
input signal power increases. However, in the case of the optimal PIC using the 
IIB, the normalized output signal power increases as the input signal power 
increases, approaching unity in the limit. Thus, the signal suppression is negligibly 
small when the input signal to uncorrelated noise ratio is large. 
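
The three observations on signal suppression can be reproduced numerically. In the sketch below the IIB expression comes from (2.6.79); the CIB expression $\rho^2/(1+\alpha_I)^2$ is an assumption, chosen because it is the form implied by the difference (2.6.83), and the function names are illustrative.

```python
def alpha_i(p_s, p_i, sigma2, L, rho):
    """SNR at the output of the interference beam, as in (2.6.90)."""
    return (1.0 - rho) * p_s / (p_i + sigma2 / L)


def norm_signal_cib(p_s, p_i, sigma2, L, rho):
    """Normalized output signal power for the CIB (assumed form rho^2/(1+alpha_I)^2)."""
    return (rho / (1.0 + alpha_i(p_s, p_i, sigma2, L, rho))) ** 2


def norm_signal_iib(p_s, sigma2, L, rho):
    """Normalized output signal power for the IIB, from (2.6.79)."""
    eps = sigma2 / (L * p_s)
    return (rho * (1.0 + eps) / (rho + eps)) ** 2


def difference(p_s, p_i, sigma2, L, rho):
    """Closed-form difference of the two normalized powers, following (2.6.83)."""
    a = alpha_i(p_s, p_i, sigma2, L, rho)
    eps = sigma2 / (L * p_s)
    num = ((rho + eps) + (1.0 + eps) * (1.0 + a)) * ((1.0 - rho) + a * (1.0 + eps))
    return -rho ** 2 * num / ((1.0 + a) ** 2 * (rho + eps) ** 2)


if __name__ == "__main__":
    L, sigma2, rho = 10, 0.01, 0.48          # scenario of Figure 2.15
    for p_s in (0.01, 0.1, 1.0, 10.0):
        print(p_s, norm_signal_cib(p_s, 1.0, sigma2, L, rho),
              norm_signal_iib(p_s, sigma2, L, rho))
```

The closed-form difference is negative for all inputs, so under these assumptions the CIB always suppresses the signal more than the IIB.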
FIGURE 2.14 

Normalized output signal power of the PIC using the OIB with $p_I = 1$; the IIB with $p_I = 1$; and the CIB with $p_I$ = 
1, 0.1 and 0.01 vs. input signal power for a ten-element linear array, $\theta_0 = 90^\circ$, $\sigma_n^2 = 0.01$, $\theta_I = 30^\circ$. (From Godara, 
L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 

Figure 2.14 and Figure 2.15 show plots of the normalized output signal power of the 
optimal PIC using OIB and IIB when the interference power is 1.0 and using the CIB when 
the interference powers are 0.01, 0.1, and 1.0. For Figure 2.14, the interference is at an 
angle of 60° from the signal while for Figure 2.15, the angle is 5°. The parameter $\rho$ for 
these cases is 0.99 and 0.48, respectively. Note that for both cases the normalized output 
signal power of the PIC using the CIB increases as the interference power increases. Signal 
suppression by the PIC using the CIB increases as the input signal power increases in both 
cases, but the signal suppression is greater in Figure 2.15 ($\rho = 0.48$). This is because more 
signal leaks into the interference beam for the scenario of Figure 2.15 than for Figure 2.14. 

2.6.3.5.2 Residual Interference 

The following observations about the residual interference can be made: 

1. The output of the optimal PIC using the IIB does not contain any residual interference; 
in the OIB and CIB cases, residual interference is present. 

2. For the optimal PIC using the OIB, the normalized output residual interference 
depends on $p_I/\sigma_n^2$ and the number of elements in the array. As $p_I/\sigma_n^2$ increases, 
the normalized residual interference decreases and approaches zero in the limit. 
As this ratio decreases, the normalized residual interference increases but never 
exceeds unity. Thus, the optimal PIC using the OIB always cancels some of the 
interference present at the output of the signal beam. The interference cancelation 
increases as $p_I/\sigma_n^2$ and the number of elements in the array increase. 

3. As presented in Table 2.1, the expression for the normalized residual interference 
at the output of the optimal PIC using the CIB is a product of two terms. The first 
term depends on the parameter $\rho$, which in turn is controlled by the array geometry 
and the relative directions of the two sources: for $\rho$ greater than one-half, the 
term exceeds unity. The second term depends on $\sigma_n^2/(Lp_S)$ and $p_I/p_S$, and increases 
as these parameters decrease (stronger signal), in the limit approaching unity. It 
follows that the normalized residual interference at the output of the optimal PIC 
using the CIB increases as the signal power increases, and approaches a limit that 
is more than unity when $\rho > 0.5$. Thus, in certain cases, the interference power at 
the output of the optimal PIC using the CIB may be more than the interference 
power at the output of the signal beam. 

FIGURE 2.15 

Normalized output signal power of the PIC using the OIB with $p_I = 1$; the IIB with $p_I = 1$; and the CIB with $p_I$ = 
1, 0.1 and 0.01 vs. input signal power for a ten-element linear array, $\theta_0 = 90^\circ$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, 
L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 

Comparisons of the normalized residual interference at the output of the optimal PIC 
using the CIB and OIB are shown in Figure 2.16 and Figure 2.17. The interference directions 
are 5° and 60° off broadside, respectively. The signal power is assumed to be unity. These 
figures show plots of the interference power at the output of the optimal PIC normalized 
by the interference power at the output of signal beam. Thus, the interference level above 
the 0 dB line indicates an increase in the interference power from that present in the signal 
beam. 

Figure 2.16 (the interference and signal are 5° apart, $\rho = 0.48$) shows that the optimal 
PIC in both cases cancels some interference present in the signal beam. However, the 
cancelation is very small for the lower range of the input interference and increases as the 
input interference increases. For the lower range of the input interference power, the optimal 
PIC using the CIB cancels slightly more interference than that using the OIB. The reverse 
is true at the other end of the input interference range. The optimal PIC using the OIB 
cancels about 10 dB more interference than that using the CIB when the input interference 
power is unity. 

Figure 2.17 shows the normalized output interference of the optimal PIC using the OIB 
and CIB when the interference and the signal are 60° apart ($\rho = 0.99$). The figure shows 
that for the lower range of the input interference, the residual interference at the output 
of the optimal PIC using the CIB is about 40 dB more than the interference contents in 
the signal beam. Thus, the optimal PIC using the CIB does not suppress weak interference, 
but increases its level. In the case of the optimal PIC using the OIB, when the input 
interference power is very small, some interference reduction takes place. The reduction 
is about 2 dB. 

FIGURE 2.16 

Normalized residual interference power of the PIC using the OIB and the CIB vs. input interference power for 
a ten-element linear array, $\theta_0 = 90^\circ$, $p_S = 1.0$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, L.C., J. Acoust. Soc. Am., 85, 
202-213, 1989 [God89a]. With permission.) 

FIGURE 2.17 

Normalized residual interference power of the PIC using the OIB and the CIB vs. input interference power for 
a ten-element linear array, $\theta_0 = 90^\circ$, $p_S = 1.0$, $\sigma_n^2 = 0.01$, $\theta_I = 30^\circ$. (From Godara, L.C., J. Acoust. Soc. Am., 85, 
202-213, 1989 [God89a]. With permission.) 

For both cases, the normalized output interference decreases as the input interference 
power increases. For the entire range of input interference level, the residual interference 
at the output of the optimal PIC using the CIB is about 42 dB more than that using the OIB. 
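
The decibel figures quoted for Figure 2.16 and Figure 2.17 can be approximated with the sketch below. The OIB expression follows from (2.6.70) after normalizing by $p_I(1-\rho)$. The CIB expression is an assumption: it is written as the product of a $\rho$-dependent term and an $\alpha_I$-dependent term, matching the verbal description of the Table 2.1 entry, and should be treated as a reconstruction rather than a quotation. Function names are illustrative.

```python
import math


def residual_oib_db(p_i, sigma2, L, gamma_o):
    """Normalized residual interference for the OIB, from (2.6.70), in dB."""
    value = 1.0 / (1.0 + gamma_o * L * p_i / sigma2) ** 2
    return 10.0 * math.log10(value)


def residual_cib_db(p_s, p_i, sigma2, L, rho):
    """Normalized residual interference for the CIB (assumed two-term product), in dB."""
    alpha = (1.0 - rho) * p_s / (p_i + sigma2 / L)      # alpha_I, as in (2.6.90)
    value = (rho / (1.0 - rho)) ** 2 * (alpha / (1.0 + alpha)) ** 2
    return 10.0 * math.log10(value)


if __name__ == "__main__":
    L, sigma2, p_s = 10, 0.01, 1.0
    # (rho, gamma_o) pairs quoted in the text for 60 and 5 degrees of separation.
    for label, rho, gamma_o in (("60 deg apart", 0.99, 0.17), ("5 deg apart", 0.48, 0.01)):
        for p_i in (1e-3, 1e-2, 1e-1, 1.0):
            print(label, p_i,
                  round(residual_cib_db(p_s, p_i, sigma2, L, rho), 1),
                  round(residual_oib_db(p_i, sigma2, L, gamma_o), 1))
```

Under these assumptions, weak interference at 60° separation comes out roughly 40 dB above the signal-beam level for the CIB and about 1 to 2 dB below it for the OIB, while at 5° separation and unit interference power the OIB cancels roughly 10 dB more than the CIB.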
2.6.3.5.3 Uncorrelated Noise Power 

A comparison of the normalized uncorrelated noise power at the output of the optimal 
PIC for the CIB, OIB, and IIB is shown in Table 2.1. The table shows that the normalized 
uncorrelated noise power at the output of the optimal PIC using the OIB is greater than 
unity. In other words, the optimal PIC has increased the uncorrelated noise. 

For the case of the optimal PIC using the IIB, the decrease or increase in the uncorrelated 
noise power depends on the difference between the parameter $\rho$ and the square of the 
uncorrelated noise to signal ratio at the output of the signal beam, $(\sigma_n^2/(Lp_S))^2$. The normalized 
uncorrelated noise power at the output of the PIC is more than unity when $\rho > (\sigma_n^2/(Lp_S))^2$. 
Thus, in the presence of a relatively stronger signal source, the optimal PIC using the IIB 
increases the uncorrelated noise power. 

2.6.3.5.4 Signal-to-Noise Ratio 

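First a comparison between the SNRs of the PIC using the IIB and OIB is considered. It 
follows from (2.6.75) and (2.6.82) that 

$$\frac{\mathrm{SNR}(\hat{w}_o)}{\mathrm{SNR}(\hat{w}_I)} = \frac{1}{\rho}\,\frac{\gamma_o + \sigma_n^2/(Lp_I)}{1 - \rho + \gamma_o + \sigma_n^2/(Lp_I)} \tag{2.6.84}$$

which implies that 

$$\mathrm{SNR}(\hat{w}_I) > \mathrm{SNR}(\hat{w}_o) \tag{2.6.85}$$

Furthermore, for $\rho = 1$ 

$$\mathrm{SNR}(\hat{w}_o) = \mathrm{SNR}(\hat{w}_I) \tag{2.6.86}$$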

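Now consider the PIC using the IIB and CIB. It follows from (2.6.60) and (2.6.82) that 

$$\frac{\mathrm{SNR}(\hat{w}_c)}{\mathrm{SNR}(\hat{w}_I)} = \frac{1}{(1+\alpha_I)^2 + \dfrac{\rho}{1-\rho}\,\alpha_I^2\left(1 + Lp_I/\sigma_n^2\right)} \tag{2.6.87}$$

Thus 

$$\mathrm{SNR}(\hat{w}_I) > \mathrm{SNR}(\hat{w}_c) \tag{2.6.88}$$

Furthermore, for low values of $\alpha_I$ 

$$\mathrm{SNR}(\hat{w}_I) \approx \mathrm{SNR}(\hat{w}_c) \tag{2.6.89}$$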

Note that 

$$\alpha_I = \frac{(1-\rho)\,p_S}{p_I + \sigma_n^2/L} \tag{2.6.90}$$

is the SNR at the output of the interference beam. 
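
The SNR comparisons can be checked numerically for the scenarios used in the figures. The sketch below is illustrative: the three SNR functions follow (2.6.60), the ratio of (2.6.69) to (2.6.72), and the ratio of (2.6.79) to (2.6.80); the two ratio functions follow (2.6.84) and (2.6.87). All function names are assumptions, and the ordering between the IIB and OIB is examined only for the parameter values quoted in the text.

```python
def alpha_i(p_s, p_i, sigma2, L, rho):
    """SNR at the output of the interference beam, (2.6.90)."""
    return (1.0 - rho) * p_s / (p_i + sigma2 / L)


def snr_cib(p_s, p_i, sigma2, L, rho):
    """Output SNR with the CIB, (2.6.60)."""
    a = alpha_i(p_s, p_i, sigma2, L, rho)
    den = (1.0 - rho) * (1.0 + a) ** 2 * sigma2 / L + rho * a ** 2 * (p_i + sigma2 / L)
    return rho * (1.0 - rho) * p_s / den


def snr_oib(p_s, p_i, sigma2, L, rho, gamma_o):
    """Output SNR with the OIB, from (2.6.69) and (2.6.72)."""
    eps = sigma2 / (L * p_i)
    return (L * p_s / sigma2) * (gamma_o + eps) / (1.0 - rho + gamma_o + eps)


def snr_iib(p_s, sigma2, L, rho):
    """Output SNR with the IIB, from (2.6.79) and (2.6.80)."""
    return rho * L * p_s / sigma2


def ratio_oib_iib(p_i, sigma2, L, rho, gamma_o):
    """SNR(OIB)/SNR(IIB), following (2.6.84)."""
    eps = sigma2 / (L * p_i)
    return (gamma_o + eps) / (rho * (1.0 - rho + gamma_o + eps))


def ratio_cib_iib(p_s, p_i, sigma2, L, rho):
    """SNR(CIB)/SNR(IIB), following (2.6.87)."""
    a = alpha_i(p_s, p_i, sigma2, L, rho)
    return 1.0 / ((1.0 + a) ** 2 + rho / (1.0 - rho) * a ** 2 * (1.0 + L * p_i / sigma2))


if __name__ == "__main__":
    L, sigma2, p_i = 10, 0.01, 1.0
    for rho, gamma_o in ((0.99, 0.17), (0.48, 0.01)):   # Figures 2.18 and 2.19
        for p_s in (0.01, 0.1, 1.0):
            print(rho, p_s, snr_cib(p_s, p_i, sigma2, L, rho),
                  snr_oib(p_s, p_i, sigma2, L, rho, gamma_o), snr_iib(p_s, sigma2, L, rho))
```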
FIGURE 2.18 

Output SNR of the PIC using the OIB, the IIB and the CIB vs. input SNR for a ten-element linear array, $\theta_0 = 90^\circ$, 
$p_I = 1.0$, $\sigma_n^2 = 0.01$, $\theta_I = 30^\circ$. (From Godara, L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 

FIGURE 2.19 

Output SNR of the PIC using the OIB, the IIB and the CIB vs. input SNR for a ten-element linear array, $\theta_0 = 90^\circ$, 
$p_I = 1.0$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, L.C., J. Acoust. Soc. Am., 85, 202-213, 1989 [God89a]. With permission.) 


The above discussion agrees with the comparison of the output SNRs for the IIB, OIB, 
and CIB cases shown in Figure 2.18 and Figure 2.19. For these cases, a unit power interference 
is assumed to be present. The direction of the interference is 60° from broadside 
in Figure 2.18 and 5° from broadside in Figure 2.19. The parameter $\rho$ is 0.99 and 0.48, 
respectively, and the parameter $\gamma_o$ is 0.17 and 0.01, respectively. One observes from these 
figures that in the case of the CIB, the output SNR decreases as the input SNR increases 
beyond -8 dB in Figure 2.18 and beyond -16 dB in Figure 2.19. However, in the other two 
cases the output SNR increases as the input SNR increases, resulting in array gains of the 
order of 20 to 30 dB. 

In the next two sections, a comparison of the optimal element space processor (ESP) 
and the optimal PIC with an OIB is presented. It should be noted that the ESP is optimized 
to minimize the mean output power subject to a unity constraint in the look direction and 
the PIC is optimized to minimize the mean output power with the interference beam 
having a null in the look direction. 


2.6.4 Comparison of Postbeamformer Interference Canceler with Element 
Space Processor 

Performance of the optimal ESP is a function of $\rho$, and the performance of the optimal 
PIC with an OIB is dependent on $\rho$ and $\gamma_o$. Thus, performance comparison of the two 
processors depends on the relative values of these two constants. 

First, consider a case where the precise interference direction is known. Let the interference 
beam be formed using an OIB given by 

$$U_o = P S_I \tag{2.6.91}$$

where 

$$P = I - \frac{S_0 S_0^H}{L} \tag{2.6.92}$$

A simple calculation indicates that for the interference beamformer weights given by 
(2.6.91) and (2.6.92), $\gamma_o$ attains its maximum value and 

$$\gamma_o = \rho \tag{2.6.93}$$

A comparison of the results derived in Sections 2.4 and 2.6.3 reveals that for this case 
the output powers and the SNRs of the two processors are identical (see (2.4.42) and 
(2.6.75)). Thus, if the interference beam of the PIC is formed by an OIB for which (2.6.93) 
holds, then the performance of the optimal PIC is identical to the performance of the 
optimal ESP. 
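
The statement that the projection-based OIB attains $\gamma_o = \rho$ can be verified numerically. The sketch below is an illustration under assumptions: the uniform-linear-array steering model and the helper names are not from the text, and $\rho$ is taken to be $1 - |S_0^H S_I|^2/L^2$.

```python
import cmath
import math


def steering(theta_deg, L=10, spacing=0.5):
    """Steering vector of a uniform linear array (assumed model, angle from array line)."""
    psi = 2.0 * math.pi * spacing * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * psi * k) for k in range(L)]


def inner(a, b):
    """Hermitian inner product a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def projected_oib(s0, si):
    """U_o = (I - S_0 S_0^H / L) S_I, as in (2.6.91) and (2.6.92)."""
    L = len(s0)
    beta = inner(s0, si) / L              # S_0^H S_I / L
    return [b - beta * a for a, b in zip(s0, si)]


def rho(s0, si):
    L = len(s0)
    return 1.0 - abs(inner(s0, si)) ** 2 / L ** 2


def gamma_o(u, si):
    """gamma_o of (2.6.67)."""
    L = len(si)
    return abs(inner(u, si)) ** 2 / (L * inner(u, u).real)


if __name__ == "__main__":
    s0 = steering(90.0)
    for theta_i in (30.0, 85.0):
        si = steering(theta_i)
        u = projected_oib(s0, si)
        print(theta_i, abs(inner(u, s0)), gamma_o(u, si), rho(s0, si))
```

By contrast, the endfire steering vector used in the earlier examples gives $\gamma_o \approx 0.17$ against $\rho \approx 0.99$ for the 30° case, which is the situation $\gamma_o < \rho$ treated next.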

However, if the interference beam of the PIC is formed by an OIB for which 

$$\gamma_o < \rho \tag{2.6.94}$$

then a comparison of the results for the two processors (an expression for $\hat{P}_N$ results using 
(2.4.20) and (2.4.41)) reveals that 

$$P_N(\hat{w}_o) > \hat{P}_N \tag{2.6.95}$$

and 

$$\mathrm{SNR}(\hat{w}_o) < \mathrm{SNR}(\hat{w}) \tag{2.6.96}$$

Thus, the total noise power at the output of the optimal PIC in this case is more than 
the total noise power at the output of the optimal ESP, and the SNR achievable by the 
optimal PIC is less than that achievable by the optimal ESP. It follows from (2.4.42) and 
(2.6.75) that the ratio of the two SNRs is given by 

$$\frac{\mathrm{SNR}(\hat{w}_o)}{\mathrm{SNR}(\hat{w})} = \frac{\sigma_n^2 + Lp_I}{\sigma_n^2 + (1+\gamma_o-\rho)Lp_I}\cdot\frac{\sigma_n^2 + \gamma_o Lp_I}{\sigma_n^2 + \rho Lp_I} \tag{2.6.97}$$

For $\sigma_n^2/(Lp_I) \ll \gamma_o$, this ratio reduces to 

$$\frac{\mathrm{SNR}(\hat{w}_o)}{\mathrm{SNR}(\hat{w})} \approx \frac{\gamma_o}{(1+\gamma_o-\rho)\,\rho} \tag{2.6.98}$$

and depends on the relative values of $\rho$ and $\gamma_o$. Furthermore, if $\rho = 1$, then it follows from 
(2.6.98) that the output SNRs of the two processors are approximately the same. Plots of 
(2.6.98) for four values of $\rho$ as a function of $\gamma_o$ are shown in Figure 2.20. The figure shows 
that the difference in the output SNRs of the two processors is smaller for the larger values 
of these constants and increases as these constants decrease. 

FIGURE 2.20 

Difference in the SNRs of the two processors calculated using (2.6.98) as a function of $\rho$ and $\gamma_o$. (From Godara, 
L.C., IEEE Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.) 
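
The ratio of the two SNRs is easy to tabulate. The following sketch evaluates (2.6.97) and its approximation (2.6.98); the function names and the sample values are illustrative assumptions.

```python
import math


def snr_ratio(p_i, sigma2, L, rho, gamma_o):
    """SNR of the PIC with an OIB relative to the ESP, following (2.6.97)."""
    lp = L * p_i
    first = (sigma2 + lp) / (sigma2 + (1.0 + gamma_o - rho) * lp)
    second = (sigma2 + gamma_o * lp) / (sigma2 + rho * lp)
    return first * second


def snr_ratio_approx(rho, gamma_o):
    """Approximation (2.6.98), valid when sigma_n^2/(L p_I) is much smaller than gamma_o."""
    return gamma_o / ((1.0 + gamma_o - rho) * rho)


if __name__ == "__main__":
    for rho in (0.25, 0.5, 0.75, 1.0):
        for gamma_o in (0.05, 0.1, 0.25):
            if gamma_o <= rho:
                loss_db = 10.0 * math.log10(snr_ratio_approx(rho, gamma_o))
                print(f"rho={rho}, gamma_o={gamma_o}: SNR difference {loss_db:.2f} dB")
```

The ratio equals one when $\gamma_o = \rho$ and falls below one whenever $\gamma_o < \rho < 1$, since the numerator and denominator of (2.6.97) differ by $(Lp_I)^2(\gamma_o-\rho)(1-\rho)$.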


2.6.5 Comparison in Presence of Look Direction Errors 

Knowledge of the look direction is used to constrain the array response in the direction 
of the signal such that the signal arriving from the look direction is passed through the 
array processor undistorted. The array weights of the element space optimal beamformer 
are estimated by minimizing the mean output power subject to the look direction constraint. 
The processor maximizes the output signal-to-noise ratio by canceling all interference. A 
directional source is treated as interference if it is not in the look direction. This shows the 
importance of the accuracy of the look direction. An error occurs when the look direction 
is not the same as the desired signal direction. For this case, the processor treats the desired 
signal source as interference and attenuates it. The amount of attenuation depends on the 
power of the signal and the amount of error [Ows73, Cox73, God87, Zah72]. A stronger 
signal is canceled more and a larger error causes more cancelation of the signal. 

The solution to the look direction error, also known as the beam-pointing error, is to 
make the beam broader so that when the signal is not precisely in the direction where it 
should be (the look direction), its cancelation does not take place. The various methods 
of broadening the beam include multiple linear constraints [Ows73, Ste83] and norm 
constraints. Norm constraints prevent the main beam from blowing out, as happens in the 
presence of pointing error. In the process of canceling a source close to the point constraint 
in the look direction, the array response is increased in the direction opposite to the 
pointing error. A scheme to reduce the effect of pointing error, which does not require 
broadening of the main beam, has been reported in [Pon96]. It makes use of direction 
finding techniques combined with a reduced dimensional maximum likelihood formulation 
to accurately estimate the direction of the desired signal. The effectiveness of this 
scheme in mobile communications has been demonstrated using computer simulations. 
Other schemes to remedy pointing error problems may be found in [Lo90, Muc81, Roc87]. 

In this section, the performance of the optimal element space processor and the beam 
space processor in the presence of beam-pointing error is compared [God87]. The comparison 
presented here indicates that beam space processors in general are more robust 
to pointing errors than element space processors. 

It is assumed for this analysis that the actual signal direction is different from the known 
signal direction. Let the steering vector in the actual signal direction be denoted by $\bar{S}_0$. 
The array correlation matrix R in this case is given by 

$$R = p_S \bar{S}_0 \bar{S}_0^H + p_I S_I S_I^H + \sigma_n^2 I \tag{2.6.99}$$

and the weights $\hat{w}$ of the optimal ESP and $\hat{w}_o$ of the optimal PIC with an OIB estimated 
from the known signal direction are given by 

$$\hat{w} = \frac{R^{-1} S_0}{S_0^H R^{-1} S_0} \tag{2.6.100}$$

and 

$$\hat{w}_o = \frac{V^H R\,U_o}{U_o^H R\,U_o} \tag{2.6.101}$$

where V is given by (2.6.50) and $U_o$ satisfies (2.6.63). 

The output power $\hat{P}$ of the ESP is given by 

$$\hat{P} = \hat{w}^H R\,\hat{w} \tag{2.6.102}$$

and the output power $P(\hat{w}_o)$ of the PIC processor is given by 

$$P(\hat{w}_o) = V^H R V + \hat{w}_o^* \hat{w}_o\,U_o^H R U_o - \hat{w}_o\,U_o^H R V - \hat{w}_o^*\,V^H R U_o \tag{2.6.103}$$

A detailed comparative study of the performance of the two processors in the presence 
of the signal direction error (SDE) is presented in Figure 2.21 to Figure 2.28. These figures 
show how the direction of the interference source, the number of elements in the array, 
and the uncorrelated noise power level affect the performance of the two processors as a 
function of the error in the signal direction. For all these figures, interference power is 
taken to be 20 dB more than signal power. 

FIGURE 2.21 

Output signal power vs. the SDE for a ten-element linear array, $p_I = 100$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, L.C., 
IEEE Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.) 

FIGURE 2.22 

Output uncorrelated noise power vs. the SDE for a ten-element linear array, $p_I = 100$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From 
Godara, L.C., IEEE Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.) 

Figure 2.21 to Figure 2.24 show, respectively, the comparison of the output signal powers, 
the output uncorrelated noise powers, the power patterns, and the output SNRs of the two 
processors when the assumed look direction is broadside to a ten-element linear array 
with half-wavelength spacing. The direction of the interference is 85° relative to the line 
of the array, and the uncorrelated noise power level is 20 dB below the signal level. The 
interference beamforming weights of the PIC processor are calculated using (2.6.91) and 
(2.6.92). 

FIGURE 2.23 

Power pattern of a ten-element linear array when SDE = 1°, $p_I = 100$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, L.C., 
IEEE Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.) 

FIGURE 2.24 

Output SNR vs. the SDE for a ten-element linear array, $p_I = 100$, $\sigma_n^2 = 0.01$, $\theta_I = 85^\circ$. (From Godara, L.C., IEEE 
Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.) 

Figure 2.21 shows that the output signal powers of the two processors are the same in 
the absence of the SDE. As the SDE increases, the signal suppression by the ESP increases, 
and it suppresses more than 11 dB of signal power in the presence of a 1° error in the signal 
direction. Note that the error in the signal direction is measured relative to the look 
direction and is assumed to be positive in the counterclockwise direction. Thus, errors of 
-1° and 1° mean, respectively, that the signal direction is 89° and 91° relative to the line of 
the array. Furthermore, the line of the array, interference direction, and signal direction 
are in the same plane. 

© 2004 by CRC Press LLC 

The signal suppression of the PIC processor is substantially less than that of the ESP. 
When the error is -1°, the PIC reduces the output signal power by less than 2 dB, compared 
with 11 dB for the ESP. When the error is 1°, the PIC increases the output signal power by 
about 1 dB, whereas the ESP suppresses more than 13 dB of signal. 

A comparison of the uncorrelated noise powers of the two processors is shown in 
Figure 2.22. This figure shows that there is no noticeable effect on the output uncorrelated 
noise power of the PIC processor due to the presence of the SDE. However, there is a 
significant increase in the uncorrelated noise output power of the ESP. A small SDE, of 
the order of 0.4°, causes an increase of the order of 20 dB in the uncorrelated noise output 
power. 

Figure 2.23 shows the power patterns of the two processors when the error is 1°. The 
reduced response in the signal direction and an increased response to the uncorrelated 
noise are clearly visible from the pattern of the ESP. 

Figure 2.24 compares the output SNRs of the two processors. The performance of the 
two processors is the same in the absence of errors. The effect of the SDE on the output 
SNR of the PIC is a slight reduction for a -1° error and a slight increase for a 1° error. 
However, the error causes a significant reduction in the output SNR of the ESP. 

Figure 2.25 compares the output SNRs of the two processors when the interference 
direction is 25° relative to the line of the array. This figure demonstrates that the output 
SNR of the ESP in Figure 2.25 is reduced by more than 20 dB by 0.1° error in the signal 
direction. On the other hand, the effect of the SDE on the output SNR of the PIC is 
negligibly small. It should be noted that the constant $\rho$ attains values of 0.99 and 0.48,
respectively, for the scenarios of Figure 2.24 and Figure 2.25. A comparison of these figures
shows how the direction of the interference affects the output SNR of the ESP for a given
SDE. One observes that the performance of the ESP in a noise configuration with a higher
value of $\rho$ is poorer than that with a lower value of $\rho$.



FIGURE 2.25

Output SNR vs. the SDE for a ten-element linear array, $p_I = 100$, $\sigma_n^2 = 0.01$, $\theta_I = 25^\circ$. (From Godara, L.C., IEEE
Trans. Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.)

FIGURE 2.26

Output SNR vs. the SDE for a ten-element linear array, $p_I = 100$, $\sigma_n^2 = 1$, $\theta_I = 25^\circ$. (From Godara, L.C., IEEE Trans.
Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.)

Figure 2.26 shows the output SNR plots of the two processors when the uncorrelated 
noise level is raised to that of the signal level. Other noise and array parameters are the 
same as in Figure 2.25. The effect of the raised uncorrelated noise level on the ESP in the 
presence of the SDE is that the processor becomes less sensitive to the error. The output
SNR of the ESP in the presence of a 1° error is about 4 dB, compared with about 10 dB
for the PIC processor. The output SNR of 10 dB is the level achievable by the two processors
in the absence of the error.

For a given uncorrelated noise level, the output SNRs of the two processors in the 
absence of errors can be increased by increasing the number of elements, as shown in 
Figure 2.27, where the number of elements of the linear array is increased from 10 to 20. 
Comparing Figure 2.27 with Figure 2.26, an increase of about 3 dB in the output SNRs of 
the two processors in the absence of SDE is noticeable. One also observes from the two 
figures that the ESP is more sensitive to the SDE in the presence of an array with a greater 
number of elements. With an array of 20 elements, the output SNR of the ESP in the 
presence of 1° SDE is about -4 dB, in comparison to 4 dB when the number of elements 
in the array is ten. 

All the above results are for a linear array. Similar results were reported in [God87] 
when a planar array was used. 

The above results show that in the absence of errors both processors produce identical
results, whereas in the presence of look direction errors the beam space processor produces
superior performance. Such errors arise when the known direction of the signal is
different from the actual direction. Now let us look at the reason for this difference in the
performance of the two processors.

The weights of the processor are constrained with the known look direction. When the 
actual signal direction is different from the one used to constrain weights, the ESP cancels 
this signal as if it were interference close to the look direction. The beam space processor, 
on the other hand, is designed to have the main beam steered in the known look direction 
and the auxiliary beams are formed to have nulls in this direction. The response of the
main beam does not alter much as one moves slightly away from the look direction, and
thus the signal level in the main beam is not affected. Similarly, when a null of the auxiliary
beams is placed in the known look direction, only a very small amount of the signal leaks into
the auxiliary beam from a source very close to the null. Thus the subtraction process
does not affect the signal level in the main beam, yielding a very small signal cancelation
in the beam space processing compared to the ESP. For details of the effect of other errors
on the beam space processors, particularly GSC, see, for example, [Jab86].

FIGURE 2.27

Output SNR vs. the SDE for a 20-element linear array, $p_I = 100$, $\sigma_n^2 = 1$, $\theta_I = 25^\circ$. (From Godara, L.C., IEEE Trans.
Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.)

A comparison of the performance of the PIC with the tamed element space processor 
is presented in Figure 2.28 for the scenario of Figure 2.27. For the tamed array, as discussed 
in [Tak86], the weights of the optimal ESP are calculated using the array correlation matrix
$R_T$, given by

$$R_T = R + \alpha_0 I \tag{2.6.104}$$

where $\alpha_0$ is a control variable. The performance of the tamed array is optimized for

- I p s 2 (2.6.105) 

Figure 2.27 and Figure 2.28 show that the performance of ESP in the presence of SDE 
has improved substantially using this procedure. However, the PIC performs better than 
the tamed ESP. 


FIGURE 2.28

Output SNR vs. the SDE for a 20-element linear array, $p_I = 100$, $\sigma_n^2 = 1$, $\theta_I = 25^\circ$. (From Godara, L.C., IEEE Trans.
Circuits Syst., 34, 721-730, 1987. ©IEEE. With permission.)


2.7 Effect of Errors

The optimal weights of an antenna array, computed using the steering vector in the
direction of arrival of the desired signal and the noise-only array correlation matrix or the
total array correlation matrix, maximize the output SNR in the absence of errors. In
practice, the estimated optimal weights are corrupted by random errors that arise from
imperfect knowledge of the array correlation matrix, errors in steering vector estimation
caused by imperfect knowledge of the array element positions, errors due to the finite
word-length arithmetic used, and so on. Thus, it is important to know how these errors
degrade array performance. The effect of some of these errors on the performance of the
optimal processor is discussed in the following sections.


2.7.1 Weight Vector Errors 

Array weights are calculated assuming ideal conditions, stored in memory, and
implemented using amplifiers and phase shifters. Theoretical study of system performance
assumes ideal error-free weights, whereas the actual performance of the system
depends on the implemented weights. The amplitude as well as the phase of these
weights differ from the ideal ones, and these differences arise from many types of
errors caused at various points in the system: deviation from the assumption
that a plane wave arrives at the array; uncertainty in the positions and the characteristics
of array elements; error in the knowledge of the array correlation matrix caused by its
estimation from a finite number of samples; error in the steering vector or the reference
signal used to calculate weights; computational error caused by finite precision arithmetic;
quantization error in converting the analog weights into digital form for storage; and
implementation error caused by component variation. Studies of weight errors have been
conducted in which these errors are modeled as random fluctuations in weights [God86,
Lan85, Ber77, Hud77, Nit76, Ard88] or as errors in amplitude and
phase [Ram80, Far83, Qua82, Kle80, Cox88, DiC78]. Performance indices to measure the
effect of errors include the array gain [God86, Far83], reduction in null depth [Lan85],
reduction in interference rejection capability [Nit76], change in side-lobe level [Ram80,
Qua82, Kle80], and bias in the angle of arrival estimation [Cox88].

The array gain is the ratio of the output SNR to the input SNR. The effect of random 
weight fluctuation is to cause reduction in the array gain. The effect is sensitive to the 
number of elements in the array and the array gain of the error-free system [God86]. For 
an array with a large number of elements and with a large error-free gain, a large weight 
fluctuation could reduce its array gain to unity, which implies that output SNR becomes 
equal to the input SNR and no array gain is obtainable. 

In this section, the effects of random errors in the weights of the processors on the output 
signal power, output noise power, output SNR, and array gain are analyzed [God86]. It 
is assumed that the estimated weights are different from the optimal weights by additive
random noise components. Let these errors be represented by an L-dimensional vector $\Gamma$
with the following statistics:

$$E[\Gamma_i] = 0, \quad i = 1, 2, \ldots, L$$

$$E[\Gamma_i \Gamma_j^*] = \begin{cases} \sigma_w^2 & i = j \\ 0 & i \neq j \end{cases} \qquad i, j = 1, 2, \ldots, L \tag{2.7.1}$$


Let an L-dimensional complex vector $\bar{w}$ represent the estimated weights of the processor.
Thus,

$$\bar{w} = \frac{R_N^{-1} S_0}{S_0^H R_N^{-1} S_0} + \Gamma \tag{2.7.2}$$


2.7.1.1 Output Signal Power

The output signal power of the processor with estimated weights $\bar{w}$ is given by

$$P_S(\bar{w}) = p_S \bar{w}^H S_0 S_0^H \bar{w} \tag{2.7.3}$$

Substituting for $\bar{w}$ and taking the mean value on both sides, this becomes, after manipulation,

$$\bar{P}_S = p_S (1 + \sigma_w^2 L) \tag{2.7.4}$$

Thus, the output signal power increases due to the random errors in the weight vector.
This increase is proportional to the input signal power, variance of errors, and number of
elements in the array.
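
The result (2.7.4) can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis. It assumes a broadside look direction (so that $S_0$ is a vector of ones), error-free weights $S_0/L$ that satisfy the unity look-direction constraint, and complex Gaussian weight errors; the function name and parameter values are hypothetical.

```python
import math
import random


def mean_output_signal_power(L, sigma_w2, p_s=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of E[p_s |w^H S0|^2] for w = w_opt + Gamma."""
    rng = random.Random(seed)
    # Assumed geometry: broadside look direction, so S0 is a vector of ones.
    s0 = [1.0 + 0.0j] * L
    # Assumed error-free weights: w_opt = S0 / L, which satisfies w_opt^H S0 = 1.
    w_opt = [s / L for s in s0]
    # Split the variance between real and imaginary parts so E|Gamma_i|^2 = sigma_w2.
    std = math.sqrt(sigma_w2 / 2.0)
    total = 0.0
    for _ in range(trials):
        w = [wi + complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) for wi in w_opt]
        response = sum(wi.conjugate() * si for wi, si in zip(w, s0))
        total += p_s * abs(response) ** 2
    return total / trials


L, sigma_w2 = 10, 0.01
estimate = mean_output_signal_power(L, sigma_w2)
predicted = 1.0 * (1.0 + sigma_w2 * L)  # (2.7.4) with p_s = 1
print(f"simulated {estimate:.3f}, predicted {predicted:.3f}")
```

The simulated mean should agree with the prediction to within the Monte Carlo sampling error, which shrinks as `trials` grows.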

2.7.1.2 Output Noise Power 

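The output noise power of the processor with estimated weights is given by

$$P_N(\bar{w}) = \bar{w}^H R_N \bar{w} \tag{2.7.5}$$

Substituting for $\bar{w}$, taking the expected value on both sides, and recognizing the fact that

$$\begin{aligned}
E[\Gamma^H R_N \Gamma] &= E[\mathrm{Tr}(\Gamma \Gamma^H R_N)] \\
&= \mathrm{Tr}(E[\Gamma \Gamma^H] R_N) \\
&= \sigma_w^2 \mathrm{Tr}(R_N) \\
&= \sigma_w^2 L p_N
\end{aligned} \tag{2.7.6}$$

after manipulation, the result is

$$\begin{aligned}
\bar{P}_N &= \hat{P}_N + \sigma_w^2 L p_N \\
&= \hat{P}_N \left[ 1 + \sigma_w^2 L \frac{p_N}{\hat{P}_N} \right] \\
&= \hat{P}_N \left[ 1 + \sigma_w^2 L \hat{G} \right]
\end{aligned} \tag{2.7.7}$$

where Tr[.] denotes the trace of [.], $\hat{P}_N$ and $\hat{G}$ are the mean output noise power and the
array gain of the optimal processor, and $p_N$ is the total input noise power that includes
directional interferences as well as uncorrelated noise.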

Thus, the output noise power increases due to the presence of random errors in the 
weights of the processor. The increase is proportional to the error variance, number of 
elements in the array, and total input noise power of the processor. 


2.7.1.3 Output SNR and Array Gain 

Let $\alpha_w$ and $G_w$ denote the output SNR and the array gain of the processor with the random
errors in the weights. It follows from (2.7.4) and (2.7.7) that

$$\alpha_w = \hat{\alpha} \, \frac{1 + \sigma_w^2 L}{1 + \sigma_w^2 L \hat{G}} \tag{2.7.8}$$

where $\hat{\alpha}$ is the output SNR of the error-free beamformer and $\hat{G}$ is its array gain.
Equation (2.7.8) describes the behavior of the output SNR as a function of the variance of the
random errors, number of elements in the array, output SNR, and array gain of the optimal processor.

Dividing both sides of this expression by the input SNR leads to an expression for $G_w$,
that is,

$$G_w = \hat{G} \, \frac{1 + \sigma_w^2 L}{1 + \sigma_w^2 L \hat{G}} \tag{2.7.9}$$

From this expression the following observations can be made: 

1. The array gain $G_w$ of the processor with the random additive errors in the weights
is a monotonically decreasing function of the variance of the random errors.

2. In the absence of errors in the weights, $G_w$ is equal to $\hat{G}$, the array gain of the
optimal processor.

3. As $\sigma_w^2$ becomes very large, $G_w$ approaches unity. Thus, for finite variance in the
random errors the output SNR is more than the input SNR.
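
The three observations above can be illustrated with a short Python sketch that evaluates the array gain expression (2.7.9) over a range of error variances. The error-free gain of 10 and the list of variances are assumed values chosen only for illustration.

```python
def array_gain_with_wve(g_opt, L, sigma_w2):
    """Array gain with random additive weight errors, following (2.7.9)."""
    return g_opt * (1.0 + sigma_w2 * L) / (1.0 + sigma_w2 * L * g_opt)


L = 10
g_opt = 10.0  # assumed error-free gain (equals L when only uncorrelated noise is present)
variances = [0.0, 0.001, 0.01, 0.1, 1.0, 100.0]
gains = [array_gain_with_wve(g_opt, L, v) for v in variances]
for v, g in zip(variances, gains):
    print(f"sigma_w^2 = {v:g}: G_w = {g:.3f}")
```

The printed gains start at the error-free value, fall monotonically, and flatten out just above unity for the largest variance.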
An analysis similar to that presented here shows that in the presence of weight vector
error (WVE), the expressions for the output signal power and output noise power of the 
SPNMI processor are the same as those of the NAMI processor. Hence, the presence of 
the signal component in the array correlation matrix, which is used to estimate the optimal 
weights, has not affected the performance of the processor in the presence of WVE. 
However, as shown in the next section, this is not the case for steering vector error (SVE). 


2.7.2 Steering Vector Errors 

The known look direction appears in the optimal weight calculation through the steering 
vector. The optimal weight calculation for the constrained beamforming requires knowledge 
of the array correlation matrix and the steering vector in the look direction. Thus, the pointing 
error causes an error to occur in the steering vector, which is used for weight calculation. 

The steering vector may also be erroneous due to other factors such as imperfect knowledge
of array element positions, errors caused by finite word-length arithmetic, and so
on. The effect of steering vector errors has been reported in [God86, Muc81, Com82]. An
analytical study that models the error as an additive random error indicates [God86] that
the effect of error is severe in the SPNMI processor, that is, when the array correlation
matrix, which is used to estimate the weights, contains the signal.

As the signal power increases, the performance of the processor deteriorates further due 
to errors. By estimating the weights using a combination of a reference signal and a steering 
vector, sensitivity of a processor to the SVE may be reduced [Hon87]. 

In this section, the effect of SVE on optimal beamformer performance is considered
[God86]. It is assumed that each component of the estimated steering vector $\bar{S}$ is different
from $S_0$ by an additive error component, that is,

$$\bar{S} = S_0 + \Gamma_S \tag{2.7.10}$$

where

$$E[\Gamma_{Si}] = 0, \quad i = 1, 2, \ldots, L \tag{2.7.11}$$

and

$$E[\Gamma_{Si} \Gamma_{Sj}^*] = \begin{cases} \sigma_s^2 & i = j \\ 0 & i \neq j \end{cases} \qquad i, j = 1, 2, \ldots, L \tag{2.7.12}$$


The analysis presented here is for processors without constraints. The NAMI processor is 
first considered. 


2.7.2.1 Noise-Alone Matrix Inverse Processor 

Let an L-dimensional vector $\bar{w}$ represent the estimated weights of the processor when $\bar{S}$
rather than $S_0$ is used in estimating the optimal weights. The expression for the estimated
weights of the processor in this case becomes

$$\begin{aligned}
\bar{w} &= \mu R_N^{-1} \bar{S} \\
&= \mu R_N^{-1} (S_0 + \Gamma_S)
\end{aligned} \tag{2.7.13}$$

where $\mu$ is a constant.

The expected value of the mean output signal power and the mean output noise power
are given below. The expectation is taken over the randomness in the steering vectors.


2.7.2.1.1 Output Signal Power 

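The output signal power of a processor with weights $\bar{w}$ is given by

$$P_S(\bar{w}) = p_S \bar{w}^H S_0 S_0^H \bar{w} \tag{2.7.14}$$

Substituting for $\bar{w}$ from (2.7.13), the signal power becomes

$$\begin{aligned}
P_S(\bar{w}) &= p_S \left[ \mu R_N^{-1} (S_0 + \Gamma_S) \right]^H S_0 S_0^H \left[ \mu R_N^{-1} (S_0 + \Gamma_S) \right] \\
&= p_S \mu^2 S_0^H R_N^{-1} S_0 S_0^H R_N^{-1} S_0 \\
&\quad + p_S \mu^2 \left[ S_0^H R_N^{-1} S_0 S_0^H R_N^{-1} \Gamma_S + \Gamma_S^H R_N^{-1} S_0 S_0^H R_N^{-1} S_0 \right] \\
&\quad + p_S \mu^2 S_0^H R_N^{-1} \Gamma_S \Gamma_S^H R_N^{-1} S_0
\end{aligned} \tag{2.7.15}$$

Taking the expected value on both sides of (2.7.15) and using (2.7.11) and (2.7.12), after
rearrangement,

$$\begin{aligned}
\bar{P}_S &= p_S \mu^2 \left[ (S_0^H R_N^{-1} S_0)^2 + \sigma_s^2 S_0^H R_N^{-1} R_N^{-1} S_0 \right] \\
&= p_S \left[ 1 + \sigma_s^2 \frac{S_0^H R_N^{-1} R_N^{-1} S_0}{(S_0^H R_N^{-1} S_0)^2} \right] \mu^2 (S_0^H R_N^{-1} S_0)^2 \\
&= p_S \left[ 1 + \sigma_s^2 \beta \right] \mu^2 / \hat{P}_N^2
\end{aligned} \tag{2.7.16}$$

where $\beta$ is the ratio of uncorrelated noise power at the output to the uncorrelated noise
power at the input of the optimal processor and $\hat{P}_N$ is the mean output noise power of
the optimal processor.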

2.7.2.1.2 Total Output Noise Power 

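The output noise power of the processor with weight vector $\bar{w}$ is given by

$$P_N(\bar{w}) = \bar{w}^H R_N \bar{w} \tag{2.7.17}$$

Substituting for $\bar{w}$ from (2.7.13), it becomes

$$\begin{aligned}
P_N(\bar{w}) &= \mu^2 S_0^H R_N^{-1} S_0 \\
&\quad + \mu^2 \left[ S_0^H R_N^{-1} \Gamma_S + \Gamma_S^H R_N^{-1} S_0 \right] \\
&\quad + \mu^2 \Gamma_S^H R_N^{-1} \Gamma_S
\end{aligned} \tag{2.7.18}$$

Taking the expected value on both sides of (2.7.18) and using (2.7.11) and (2.7.12), after
rearrangement,

$$\begin{aligned}
\bar{P}_N &= \mu^2 \left[ S_0^H R_N^{-1} S_0 + \sigma_s^2 \mathrm{Tr}(R_N^{-1}) \right] \\
&= \mu^2 S_0^H R_N^{-1} S_0 \left[ 1 + \sigma_s^2 \frac{\mathrm{Tr}(R_N^{-1})}{S_0^H R_N^{-1} S_0} \right] \\
&= \frac{\mu^2}{\hat{P}_N} \left[ 1 + \sigma_s^2 \kappa \right]
\end{aligned} \tag{2.7.19}$$

where

$$\kappa = \frac{\mathrm{Tr}(R_N^{-1})}{S_0^H R_N^{-1} S_0} \tag{2.7.20}$$

Since $\kappa > 0$, it follows from (2.7.19) that the output noise power increases in proportion
to the variance of the random errors in the steering vector.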

2.7.2.1.3 Output SNR and Array Gain 

Let $\alpha_s$ and $G_s$ denote the output SNR and the array gain of the NAMI processor with SVE.
It follows then from (2.7.16) and (2.7.19) that

$$\alpha_s = \hat{\alpha} \, \frac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \kappa} \tag{2.7.21}$$

and

$$G_s = \hat{G} \, \frac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \kappa} \tag{2.7.22}$$

where $\hat{\alpha}$ and $\hat{G}$ are the output SNR and the array gain of the optimal processor.
It follows from these two equations that the behavior of the output SNR and the array
gain of the NAMI processor with SVE depends on the relative magnitudes of $\beta$ and $\kappa$. It
can be shown that $\kappa > \beta$, and thus the array gain of the NAMI processor with the random
errors in the steering vector is a monotonically decreasing function of the error variance.


2.7.2.2 Signal-Plus-Noise Matrix Inverse Processor 

Let an L-dimensional vector $\tilde{w}$ represent the estimated weights of the SPNMI processor
when $\bar{S}$ rather than $S_0$ is used in estimating the optimal weights. The expression for $\tilde{w}$ in
this case becomes

$$\tilde{w} = \mu R^{-1} [S_0 + \Gamma_S] \tag{2.7.23}$$

where

$$R = R_N + p_S S_0 S_0^H \tag{2.7.24}$$

Using the Matrix Inversion Lemma,

$$R^{-1} = R_N^{-1} - a_0 R_N^{-1} S_0 S_0^H R_N^{-1} \tag{2.7.25}$$

where

$$a_0 = \frac{p_S}{1 + p_S S_0^H R_N^{-1} S_0} \tag{2.7.26}$$

From (2.7.23) and (2.7.25), it follows that

$$\tilde{w} = \mu R_N^{-1} [S_0 + \Gamma_S] - \mu a_0 R_N^{-1} S_0 S_0^H R_N^{-1} (S_0 + \Gamma_S) \tag{2.7.27}$$

Comparing (2.7.13) with (2.7.27) one notes that the second term in (2.7.27) is due to the
presence of the signal component in the array correlation matrix that is used in estimating
the optimal weights. As the signal component goes to zero, the second term goes to zero
because $a_0$ goes to zero, and thus $\tilde{w}$ becomes $\bar{w}$.

The effect of SVE on the output signal power, the output noise power, the output SNR,
and the array gain is now examined.
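
As a numerical sanity check of the Matrix Inversion Lemma step in (2.7.25) and (2.7.26), the following Python sketch builds $R$ and the claimed inverse and multiplies them. It assumes the simplest noise environment, $R_N = \sigma_n^2 I$, so that $R_N^{-1}$ is known in closed form; the steering vector, element count, and powers are hypothetical values.

```python
import cmath
import math


def outer(u, v):
    """Return the matrix u v^H for complex vectors u and v."""
    return [[ui * vj.conjugate() for vj in v] for ui in u]


def scaled_identity(n, scale):
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


def add_scaled(a, b, coeff):
    """Return a + coeff * b for square matrices of equal size."""
    n = len(a)
    return [[a[i][j] + coeff * b[i][j] for j in range(n)] for i in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)] for i in range(n)]


L, sigma_n2, p_s = 4, 0.5, 2.0
# Hypothetical steering vector: half-wavelength linear array, source at 60 degrees.
theta = math.radians(60.0)
s0 = [cmath.exp(1j * math.pi * i * math.cos(theta)) for i in range(L)]
ss = outer(s0, s0)

# R = R_N + p_s S0 S0^H with the assumed R_N = sigma_n2 * I, as in (2.7.24).
r = add_scaled(scaled_identity(L, sigma_n2), ss, p_s)

# a0 from (2.7.26); here S0^H R_N^{-1} S0 = L / sigma_n2.
a0 = p_s / (1.0 + p_s * L / sigma_n2)
# R^{-1} from (2.7.25); here R_N^{-1} S0 S0^H R_N^{-1} = S0 S0^H / sigma_n2^2.
r_inv = add_scaled(scaled_identity(L, 1.0 / sigma_n2), ss, -a0 / sigma_n2 ** 2)

product = matmul(r, r_inv)
max_error = max(
    abs(product[i][j] - (1.0 if i == j else 0.0)) for i in range(L) for j in range(L)
)
print(f"largest deviation of R R^-1 from the identity: {max_error:.2e}")
```

The deviation should be at the level of floating-point rounding, which confirms the form of (2.7.25) for this assumed noise model.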

2.7.2.2.1 Output Signal Power 

Following a procedure similar to that used for the NAMI processor, an expression for the
mean output signal power of the SPNMI processor in the presence of the SVE becomes

$$\tilde{P}_S = \frac{\mu^2 p_S}{(\hat{P}_N + p_S)^2} \left[ 1 + \sigma_s^2 \beta \right] \tag{2.7.28}$$

Comparing (2.7.16) with (2.7.28), in the presence of SVE the output signal power of both
processors increases and the increase is proportional to the output signal power of the
respective error-free processor and the parameter $\beta$, which is the ratio of the uncorrelated
noise powers at the output of the optimal processor to its input. Hence, the effect of the
random SVE on both processors is the same. Thus, the presence of the signal component
in the array correlation matrix has not altered the effects of SVE on output signal power.
In the next section, it is shown that this is not the case for the output noise power.

2.7.2.2.2 Total Output Noise Power 

Following a procedure similar to that used for the NAMI processor, an expression for the
mean output noise power of the processor with weight vector $\tilde{w}$ becomes

$$\tilde{P}_N = \frac{\mu^2 \hat{P}_N}{(\hat{P}_N + p_S)^2} \left\{ 1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right] \right\} \tag{2.7.29}$$

Since $\kappa > \beta$, it follows from (2.7.29) that the output noise power of the SPNMI processor
increases with the increase in the variance of random errors $\sigma_s^2$, and the increase is
enhanced by the input signal power due to the presence of product terms $\sigma_s^2 \hat{\alpha}^2$ and $\sigma_s^2 \hat{\alpha}$.
Note that the third term in (2.7.29), which contains these terms, is missing from the
expression of the output noise power given by (2.7.19) when the array correlation matrix
of noise only is used in the calculation of the optimal weights.
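
The manipulation behind the SPNMI output noise power is omitted in the text. One way to arrive at it is sketched below. This is a reconstruction under stated assumptions, not the author's own derivation: $R_N$ is Hermitian, $E[\Gamma_S \Gamma_S^H] = \sigma_s^2 I$, and $\beta$ and $\kappa$ take the forms $\beta = S_0^H R_N^{-1} R_N^{-1} S_0 / (S_0^H R_N^{-1} S_0)^2$ and $\kappa = \mathrm{Tr}(R_N^{-1}) / (S_0^H R_N^{-1} S_0)$. The shorthand symbols $c$ and $A$ are introduced here only for convenience.

```latex
% Shorthand (not from the text): c = S_0^H R_N^{-1} S_0 = 1/\hat{P}_N and
% A = R_N^{-1} - a_0 R_N^{-1} S_0 S_0^H R_N^{-1}, so that \tilde{w} = \mu A (S_0 + \Gamma_S).
\begin{align*}
E[\tilde{w}^H R_N \tilde{w}]
  &= \mu^2 \left[ S_0^H A R_N A S_0 + \sigma_s^2\, \mathrm{Tr}(A R_N A) \right]
     && \text{(cross terms vanish since } E[\Gamma_S] = 0\text{)} \\
S_0^H A R_N A S_0
  &= (1 - a_0 c)^2\, c \\
\mathrm{Tr}(A R_N A)
  &= \mathrm{Tr}(R_N^{-1}) - (2 a_0 - a_0^2 c)\, S_0^H R_N^{-1} R_N^{-1} S_0
   = c \left[ \kappa - \left( 1 - (1 - a_0 c)^2 \right) \beta \right] \\
1 - a_0 c
  &= \frac{1}{1 + \hat{\alpha}} = \frac{\hat{P}_N}{\hat{P}_N + p_S},
     \qquad \hat{\alpha} = p_S\, c \\
E[\tilde{w}^H R_N \tilde{w}]
  &= \frac{\mu^2 \hat{P}_N}{(\hat{P}_N + p_S)^2}
     \left\{ 1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right] \right\}
\end{align*}
```

Setting $p_S = 0$ gives $\hat{\alpha} = 0$ and recovers the NAMI result (2.7.19), which is a useful consistency check.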

2.7.2.2.3 Output SNR

Let $\tilde{\alpha}_s$ denote the output SNR of the SPNMI processor in the presence of SVE. Then it
follows that

$$\tilde{\alpha}_s = \hat{\alpha} \, \frac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right]} \tag{2.7.30}$$

which describes the behavior of the output SNR of the SPNMI processor in the presence
of random SVE. Comparing this with (2.7.21), the expression for the output SNR of the
NAMI processor, one observes the presence of $\hat{\alpha}^2$ and $\hat{\alpha}$ in the denominator of (2.7.30).
As the output SNR of the optimal processor $\hat{\alpha}$ is directly proportional to the input SNR
of the processor, it follows that:

1. The effect of SVE on output SNR of the SPNMI processor is very sensitive to the 
input signal power. 

2. The output SNR of the SPNMI processor drops faster than the output SNR of the 
NAMI processor as the error variance increases. 

3. For a given level of SVE, the output SNR of the SPNMI processor is less than the 
output SNR of the NAMI processor, and the difference increases as the power of 
the signal source increases. 

It should be noted here that the above observations are true for any array geometry and
noise environment. However, the array geometry and the noise environment would affect
the results as $\hat{\alpha}$, $\kappa$, and $\beta$ depend on them.

Now the array gain of the SPNMI processor $\tilde{G}_s$ is compared with the array gain of the
NAMI processor $G_s$ in the presence of SVE. For this case, array gain is given by

$$\tilde{G}_s = \hat{G} \, \frac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right]} \tag{2.7.31}$$

Since $\kappa > \beta$, it follows from (2.7.22) and (2.7.31) that for a given $\sigma_s^2$, the array gain $\tilde{G}_s$
of the SPNMI processor is less than the array gain $G_s$ of the NAMI processor, and $\tilde{G}_s$ falls
more rapidly than $G_s$ as the variance of the random SVE increases. The fall in $\tilde{G}_s$ is greater
at a higher input SNR than at a lower input SNR.
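
The comparison between (2.7.22) and (2.7.31) can be made concrete with a small Python sketch. The constants below are assumed values for an uncorrelated-noise-only environment (error-free gain $L$, $\beta = 1/L$, $\kappa = 1$, and error-free output SNR equal to $L$ times the input signal-to-noise ratio); the function names are hypothetical.

```python
def nami_gain(g_opt, beta, kappa, sigma_s2):
    """Array gain of the NAMI processor with SVE, following (2.7.22)."""
    return g_opt * (1.0 + sigma_s2 * beta) / (1.0 + sigma_s2 * kappa)


def spnmi_gain(g_opt, alpha_opt, beta, kappa, sigma_s2):
    """Array gain of the SPNMI processor with SVE, following (2.7.31)."""
    signal_term = (alpha_opt ** 2 + 2.0 * alpha_opt) * (kappa - beta)
    return g_opt * (1.0 + sigma_s2 * beta) / (1.0 + sigma_s2 * (kappa + signal_term))


# Assumed scenario: ten elements, uncorrelated noise only, error variance 0.01.
L, sigma_s2 = 10, 0.01
g_opt, beta, kappa = float(L), 1.0 / L, 1.0
rows = []
for snr_db in (-20, 0, 20):
    snr = 10.0 ** (snr_db / 10.0)  # input signal-to-uncorrelated-noise ratio
    alpha_opt = L * snr            # error-free output SNR for this scenario
    rows.append((snr_db,
                 nami_gain(g_opt, beta, kappa, sigma_s2),
                 spnmi_gain(g_opt, alpha_opt, beta, kappa, sigma_s2)))

for snr_db, g_nami, g_spnmi in rows:
    print(f"input SNR {snr_db:+d} dB: NAMI gain {g_nami:.3f}, SPNMI gain {g_spnmi:.5f}")
```

The NAMI gain is the same at every input SNR, while the SPNMI gain falls further below it as the input SNR rises.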


2.7.2.3 Discussion and Comments 

Table 2.2 compares the various results on SVE and WVE. All quantities are normalized 
with their respective error-free values to facilitate observation of the effect of errors. The 
following observations can be made from the table: 
TABLE 2.2

Comparison of the SVE and WVE*

| | Normalized Mean Output Signal Power | Normalized Mean Output Noise Power | Normalized Array Gain |
|---|---|---|---|
| Effect of SVE on NAMI processor | $1 + \sigma_s^2 \beta$ | $1 + \sigma_s^2 \kappa$ | $\dfrac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \kappa}$ |
| Effect of SVE on SPNMI processor | $1 + \sigma_s^2 \beta$ | $1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right]$ | $\dfrac{1 + \sigma_s^2 \beta}{1 + \sigma_s^2 \left[ \kappa + (\hat{\alpha}^2 + 2\hat{\alpha})(\kappa - \beta) \right]}$ |
| Effect of WVE on both processors | $1 + \sigma_w^2 L$ | $1 + \sigma_w^2 L \hat{G}$ | $\dfrac{1 + \sigma_w^2 L}{1 + \sigma_w^2 L \hat{G}}$ |

* $\beta$: ratio of the uncorrelated noise at the output to the input of the optimal beamformer; $\hat{G}$: array
gain of the optimal beamformer; $\hat{\alpha}$: output SNR of the optimal beamformer; $\sigma_s^2$: variance of the
additive random steering vector errors; $\sigma_w^2$: variance of the additive random weight vector errors.



1. The output signal power in all cases increases with the increase in error variance.
For the WVE case, the increase depends only on the number of elements, whereas for
SVE it depends on the array geometry and the noise environment.

2. The output noise power in all cases increases with the increase in error variance.
For the WVE case, the increase depends on $\hat{G}$ and is independent of signal power.
For the SVE case, the increase in the output noise power is dependent on the input
signal power for the SPNMI processor, and is independent of the signal power
for the NAMI processor.

3. The array gain in all cases decreases with the increase in the error variance. In the
case of WVE, the decrease in the array gain depends on $\hat{G}$. The greater $\hat{G}$ is, the
faster the array gain drops as the error variance increases. In the SVE case, the
array gain of the SPNMI processor is dependent on the output SNR of the optimal
processor ($\hat{\alpha}$), and it drops as $\hat{\alpha}$ is increased. Note that $\hat{\alpha}$ is directly proportional
to the input signal power. The effect of SVE on the NAMI processor is not affected
by the input signal power.

Two special cases of the noise environment are considered below to study the effect of 
array elements, uncorrelated noise power, direction, and power of the interference source. 

2.7.2.3.1 Special Case 1: Uncorrelated Noise Only 

Consider the case of a noise environment where only uncorrelated noise is present. Let A
denote the ratio of the input signal power to the uncorrelated noise power on each element.
For this case,

$$\hat{G} = L \tag{2.7.32}$$

$$\beta = \frac{1}{L} \tag{2.7.33}$$

$$\hat{\alpha} = LA \tag{2.7.34}$$

and

$$\kappa = 1 \tag{2.7.35}$$

TABLE 2.3

Comparison of Array Gain in the Presence of SVE and WVE with No Interference Present*

| Case | Array gain |
|---|---|
| Array gain of NAMI processor in SVE | $\dfrac{L + \sigma_s^2}{1 + \sigma_s^2}$ |
| Array gain of SPNMI processor in SVE | $\dfrac{L + \sigma_s^2}{1 + \sigma_s^2 \left[ 1 + LA^2 (L - 1) + 2A(L - 1) \right]}$ |
| Array gain of both processors in WVE | $\dfrac{L + \sigma_w^2 L^2}{1 + \sigma_w^2 L^2}$ |

* $\sigma_s^2$: variance of steering vector error; $\sigma_w^2$: variance of weight vector error;
$A = p_S / \sigma_n^2$.

The expressions for the array gains of the two processors in the presence of SVE and WVE 
are shown in Table 2.3. From the table, the following observations can be made. 


1. For a given error level $\sigma_s$, the array gain of the NAMI processor increases
as the number of elements in the array increases. Thus, for a given error level and
input SNR, the output SNR of the NAMI processor increases as L increases.

2. The array gain of the NAMI processor decreases as the error level is increased,
and it does not depend on the ratio of the input signal to the uncorrelated noise
power, A. However, the behavior of the array gain $\tilde{G}_s$ of the SPNMI processor in
the presence of SVE depends on A. For a given L, $\tilde{G}_s$ drops faster at a higher A than
at a lower A as the SVE level is increased.

3. For $A \ll 1$, the expression for $\tilde{G}_s$ becomes

$$\tilde{G}_s \approx \frac{L + \sigma_s^2}{1 + \sigma_s^2} \tag{2.7.36}$$

and for a given level of errors the array gain increases with the increase in the
number of elements, as in the case of the NAMI processor.

4. For $A \gg 1$, the expression for $\tilde{G}_s$ becomes

$$\tilde{G}_s \approx \frac{L + \sigma_s^2}{1 + \sigma_s^2 A^2 L (L - 1)} \tag{2.7.37}$$

Thus, for a given $\sigma_s$, the array gain decreases with the increase in the number of
elements for a very high input signal to uncorrelated noise ratio.

5. The plots of $\tilde{G}_s$ vs. the input SNR for various values of L are shown in Figure 2.29
for error variance equal to 0.01. The results displayed in the figure are in agreement
with the above observations.

6. A comparison of the expressions for the array gain in the presence of the SVE and
the WVE reveals that $G_w$, the array gain of both processors in the presence of WVE,
behaves similarly to $G_s$, the array gain of the NAMI processor in the presence of SVE.
For a given error level, both $G_w$ and $G_s$ increase with the increase in L. However,
for the same error level, say $\sigma_s = \sigma_w = \sigma_0$,

$$G_s - G_w = \frac{\sigma_0^2 (L - 1)(L^2 - 1)}{(1 + \sigma_0^2)(1 + \sigma_0^2 L^2)} \tag{2.7.38}$$

FIGURE 2.29

Array gain of SPNMI processor vs. SNR, no interference, and $\sigma_s^2 = 0.01$. (From Godara, L.C., IEEE Trans. Aerosp.
Electron. Syst., 22, 395-409, 1986. ©IEEE. With permission.)
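
The closed-form gains of Table 2.3 are easy to evaluate directly. The Python sketch below does so for a few array sizes and also checks the difference expression (2.7.38) against the two gains it is derived from. The array sizes, error variance, and signal-to-noise ratios are assumed values chosen only for illustration.

```python
import math


def gain_nami_sve(L, var_s):
    """NAMI array gain with SVE, uncorrelated noise only (Table 2.3)."""
    return (L + var_s) / (1.0 + var_s)


def gain_spnmi_sve(L, var_s, a):
    """SPNMI array gain with SVE; a is the input signal to uncorrelated noise ratio."""
    return (L + var_s) / (1.0 + var_s * (1.0 + L * a * a * (L - 1) + 2.0 * a * (L - 1)))


def gain_wve(L, var_w):
    """Array gain of both processors with WVE, uncorrelated noise only (Table 2.3)."""
    return (L + var_w * L * L) / (1.0 + var_w * L * L)


var0 = 0.01
sizes = [2, 4, 10, 20]
nami = [gain_nami_sve(L, var0) for L in sizes]
spnmi_low = [gain_spnmi_sve(L, var0, 1e-4) for L in sizes]    # A much less than 1
spnmi_high = [gain_spnmi_sve(L, var0, 100.0) for L in sizes]  # A much greater than 1

# Compare the direct difference G_s - G_w with the closed form (2.7.38).
differences = []
for L in sizes:
    direct = gain_nami_sve(L, var0) - gain_wve(L, var0)
    closed = var0 * (L - 1) * (L * L - 1) / ((1.0 + var0) * (1.0 + var0 * L * L))
    differences.append((direct, closed))

for L, g1, g2, g3 in zip(sizes, nami, spnmi_low, spnmi_high):
    print(f"L = {L:2d}: NAMI {g1:.3f}, SPNMI (low A) {g2:.3f}, SPNMI (high A) {g3:.5f}")
```

For these values the NAMI gain grows with the number of elements, the low-A SPNMI gain tracks it closely, and the high-A SPNMI gain shrinks as elements are added, in line with observations 1, 3, and 4.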


2.7.2.3.2 Special Case 2: One Directional Interference

Consider the case of a noise environment consisting of a directional interference of power
$p_I$ and uncorrelated noise of power $\sigma_n^2$ on each element of the array. For this case, $\hat{G}$ and $\hat{\alpha}$
are, respectively, given by (2.4.43) and (2.4.42), and $\beta$ and $\kappa$ are given by

$$\beta = \frac{1}{L} + \frac{\rho L (1 - \rho)}{(\rho L + \epsilon_0)^2} \tag{2.7.39}$$

$$\kappa = 1 + \frac{L(1 - \rho) - 1}{\rho L + \epsilon_0} \tag{2.7.40}$$

where

$$\epsilon_0 = \frac{\sigma_n^2}{p_I} \tag{2.7.41}$$

FIGURE 2.30

$G_w$ vs. $\sigma_w^2$, $\sigma_n^2/p_I = 0$ dB, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22,
395-409, 1986. ©IEEE. With permission.)
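
A minimal Python sketch of the one-interference constants follows. It assumes the closed forms $\beta = 1/L + \rho L(1 - \rho)/(\rho L + \epsilon_0)^2$ and $\kappa = 1 + [L(1 - \rho) - 1]/(\rho L + \epsilon_0)$ with $\epsilon_0 = \sigma_n^2/p_I$, which are what one obtains from $R_N = \sigma_n^2 I + p_I S_I S_I^H$ when $\rho = 1 - |S_0^H S_I|^2 / L^2$. That definition of $\rho$ is an assumption here, since it is given elsewhere in the text. The function name and the sample values are hypothetical.

```python
def beta_kappa(L, rho, eps0):
    """Return (beta, kappa) for one directional interference plus uncorrelated noise."""
    denom = rho * L + eps0
    beta = 1.0 / L + rho * L * (1.0 - rho) / denom ** 2
    kappa = 1.0 + (L * (1.0 - rho) - 1.0) / denom
    return beta, kappa


L = 10
table = {}
for rho in (0.1, 0.5, 0.9, 1.0):
    for eps0 in (1.0, 1e-4):  # uncorrelated noise to interference ratio: 0 dB and -40 dB
        table[(rho, eps0)] = beta_kappa(L, rho, eps0)

for (rho, eps0), (beta, kappa) in table.items():
    print(f"rho = {rho:.1f}, eps0 = {eps0:g}: beta = {beta:.4f}, kappa = {kappa:.4f}")

# A vanishing interference (eps0 very large) should recover the uncorrelated-noise case.
beta_weak, kappa_weak = beta_kappa(L, 0.5, 1e9)
```

With a negligible interference the constants return to $\beta = 1/L$ and $\kappa = 1$, the values of Special Case 1, and $\kappa$ stays above $\beta$ for every combination printed.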

The effect of a variation in $\rho$ on the array gain of the two processors in the presence of
WVE and SVE is shown in Figure 2.30 to Figure 2.33. The number of elements in the array
for these figures is taken to be ten.

Figure 2.30 shows $G_w$ vs. $\sigma_w^2$ for five values of $\rho$. One observes from the figure that $G_w$,
which denotes the array gain of both processors in the presence of WVE, decreases faster at higher
values of $\rho$ than at lower values of $\rho$ as the variance of the errors is increased. The result
is expected, since $\hat{G}$ increases as $\rho$ increases.

Figure 2.31 and Figure 2.32 show the effect of $\rho$ on the array gain of the SPNMI processor
in the presence of SVE for $\sigma_n^2/p_I = 0$ dB and $\sigma_n^2/p_I = -40$ dB, respectively. These figures
show that as the error variance is increased, the array gain falls more rapidly at higher
values of $\rho$ than at lower values of $\rho$. The result is expected, since $\hat{\alpha}$ increases as $\rho$ increases.

A comparison of Figure 2.31 and Figure 2.32 reveals that the effect of SVE on the array
gain is not altered significantly by increasing the interference power. The result is predictable
from the expression for $\tilde{G}_s$, since for $L\rho \gg \sigma_n^2/p_I$ the constants $\beta$, $\kappa$, and $\hat{\alpha}$ are
independent of interference power.

The effect of $\rho$ on the array gain of the NAMI processor in the presence of the SVE is shown
in Figure 2.33. The figure demonstrates that the effect of the SVE on the array gain of the
NAMI processor is almost the same for all values of $\rho$. This observation implies that the array
geometry and direction of interference do not significantly influence the effect of SVE on the
NAMI processor unless the interference direction is very close to the look direction.

Figure 2.34 and Figure 2.35 compare the three array gains $G_w$, $G_s$, and $\tilde{G}_s$ for the case
of weak interference, $\sigma_n^2/p_I = 0$ dB, and strong interference, $\sigma_n^2/p_I = -40$ dB. For these figures, input
signal power is equal to uncorrelated noise power. These figures show that the array gains
of both processors in the presence of the SVE are not affected by the interference power,
whereas $G_w$, the array gain of the two processors in the presence of WVE, is highly dependent
on the interference power. It drops faster as $\sigma_w^2$ is increased in the presence of the
interference, and the rate of drop increases with the increase in the interference power. Note
the difference in the vertical scales of the two figures.

FIGURE 2.31

$\tilde{G}_s$ vs. $\sigma_s^2$, $\sigma_n^2/p_I = 0$ dB, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22, 395-409,
1986. ©IEEE. With permission.)

FIGURE 2.32

$\tilde{G}_s$ vs. $\sigma_s^2$, $\sigma_n^2/p_I = -40$ dB, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22,
395-409, 1986. ©IEEE. With permission.)

FIGURE 2.33

$G_s$ vs. $\sigma_s^2$, $\sigma_n^2/p_I = -40$ dB, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22,
395-409, 1986. ©IEEE. With permission.)


2.7.3 Phase Shifter Errors 

The phase of the array weight is an important parameter and an error in the phase may 
cause an estimate of the source to appear in a wrong direction when an array is used for 
finding directions of sources, such as in [Cox88]. The phase control of signals is used to 
steer the main beam of the array in desired positions, as in electronic steering. A device 
normally used for this purpose is a phase shifter. Commonly available types are ferrite 
phase shifters and diode phase shifters [Mai82, Sta70]. One of the specifications that 
concerns an array designer is the root mean square (RMS) phase error. 

Analysis of the RMS phase error shows that it reduces the output SNR of the constrained
optimal processor by suppressing the desired signal, and the suppression is proportional to
the product of the signal power and the random error variance [God85]. Furthermore,
suppression is maximum in the absence of directional interferences. Quantization error
occurs in digital phase shifters. In a p-bit digital phase shifter, the minimum value by which the
phase can be changed equals $2\pi/2^p$. Assuming that the error is distributed uniformly
between $-\pi/2^p$ and $\pi/2^p$, the variance of this error equals $\pi^2/(3 \times 2^{2p})$.

In this section, the effect of random phase errors on the performance of the optimal
processor is analyzed [God85]. To facilitate this analysis, the phase shifters are separated
from the weights as shown in Figure 2.36 and are selected to steer the array in the look
direction.

FIGURE 2.34
Array gains $G_w$ (WVE) and $G_s$ (SVE, two processors) vs. $\sigma^2$, $\sigma_n^2/p_I = -40$ dB, $\rho = 0.9$, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22, 395-409, 1986. ©IEEE. With permission.)

FIGURE 2.35
Array gains $G_w$ (WVE) and $G_s$ (SVE, two processors) vs. $\sigma^2$, $\sigma_n^2/p_I = 0$ dB, $\rho = 0.9$, and ten-element array. (From Godara, L.C., IEEE Trans. Aerosp. Electron. Syst., 22, 395-409, 1986. ©IEEE. With permission.)

FIGURE 2.36
Beamformer structure showing phase shifters.

Let the optimal weights of the beamformer of Figure 2.36, referred to as the beamformer
using phase shifters, be denoted by $\tilde{w}$. It follows from the figure that the output of the
optimal beamformer using phase shifters is given by

$$y(t) = \tilde{w}^H x'(t) \tag{2.7.42}$$

where $x'(t)$ is the array signal vector received after the phase shifters and is given by (2.6.31).
Thus, using (2.6.31), (2.7.42) becomes

$$y(t) = \tilde{w}^H \Phi_0^H x(t) \tag{2.7.43}$$

Now a relationship between $\tilde{w}$ and $\hat{w}$, the weights of the optimal beamformer without
phase shifters discussed in Section 2.4, is established. The output of the optimal
beamformer without phase shifters is given by

$$y(t) = \hat{w}^H x(t) \tag{2.7.44}$$

Since the outputs of both structures are identical, it follows from (2.7.43) and (2.7.44)
that $\hat{w}$ and $\tilde{w}$ are related as follows:

$$\hat{w} = \Phi_0 \tilde{w} \tag{2.7.45}$$

An expression for $\tilde{w}$ may be obtained from (2.4.16) and (2.7.45) and is given by

$$\tilde{w} = \frac{\Phi_0^H R_N^{-1} S_0}{S_0^H R_N^{-1} S_0} \tag{2.7.46}$$


2.7.3.1 Random Phase Errors 

In this section, the effect of random phase errors on optimal processor performance is
examined. Phase shifters with random phase errors are termed "actual phase shifters,"
and the processor in this case is termed "optimal processor with phase errors" (OPPE). It
is assumed that the random phase errors that exist in the phase shifters can be modeled as
stationary processes of zero mean and equal variance and are not correlated with each
other.

Let $\delta_l$, $l = 1, 2, \ldots, L$ represent the phase error in the $l$th phase shifter. By assumption,

$$E[\delta_l] = 0, \quad l = 1, 2, \ldots, L \tag{2.7.47}$$

and

$$E[\delta_l \delta_k] = \begin{cases} \sigma^2 & \text{if } l = k, \\ 0 & \text{otherwise,} \end{cases} \quad l, k = 1, 2, \ldots, L \tag{2.7.48}$$

Let $\tilde{\alpha}_l$, $l = 1, 2, \ldots, L$ represent the phase delays of the actual phase shifters. Then

$$\tilde{\alpha}_l = \alpha_l + \delta_l, \quad l = 1, 2, \ldots, L \tag{2.7.49}$$

where $\alpha_l$, $l = 1, 2, \ldots, L$ are the phase delays of error-free phase shifters, and are given by
(2.6.29).

Let a diagonal matrix $\Phi$ be defined as

$$\Phi_{ll} = \exp(j\tilde{\alpha}_l), \quad l = 1, 2, \ldots, L \tag{2.7.50}$$

It follows from (2.7.43) that an expression for the mean output power of the optimal
beamformer using phase shifters is given by

$$\begin{aligned} \hat{P} &= E[y(t) y^*(t)] \\ &= \tilde{w}^H \Phi_0^H E[x(t) x^H(t)] \Phi_0 \tilde{w} \\ &= \tilde{w}^H \Phi_0^H R \Phi_0 \tilde{w} \end{aligned} \tag{2.7.51}$$

Similarly, the mean signal power, interference power, and uncorrelated noise power,
respectively, are given by

$$\hat{P}_S = \tilde{w}^H \Phi_0^H R_S \Phi_0 \tilde{w} \tag{2.7.52}$$

$$\hat{P}_I = \tilde{w}^H \Phi_0^H R_I \Phi_0 \tilde{w} \tag{2.7.53}$$

and

$$\hat{P}_n = \sigma_n^2 \tilde{w}^H \tilde{w} = \sigma_n^2 \hat{\beta} \tag{2.7.54}$$

where $\hat{\beta}$ is defined by (2.4.11). The last step in (2.7.54) follows from using (2.7.46).

Note that the mean output uncorrelated noise power given by (2.7.54) is not a function
of phase angles and is not affected by the random errors in the phase shifters.

The effect of random phase errors on the output signal power and output interference
power is now examined. Substituting $\Phi$ for $\Phi_0$ in (2.7.52) and (2.7.53) and taking the
expectation over random phase errors, expressions for the mean output signal power $\tilde{P}_S$ and
interference power $\tilde{P}_I$ of the OPPE follow:

$$\tilde{P}_S = \tilde{w}^H E[\Phi^H R_S \Phi] \tilde{w} = p_S \tilde{w}^H E[\Phi^H S_0 S_0^H \Phi] \tilde{w} \tag{2.7.55}$$

and

$$\tilde{P}_I = \tilde{w}^H E[\Phi^H R_I \Phi] \tilde{w} \tag{2.7.56}$$


2.7.3.2 Signal Suppression 

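Rewrite (2.7.55) in the following form:

$$\tilde{P}_S = p_S \sum_{l,k} \tilde{w}_l^* E[\Phi_{ll}^* S_{0l} S_{0k}^* \Phi_{kk}] \tilde{w}_k \tag{2.7.57}$$

Substituting for $\Phi$ and $S_0$ in (2.7.57), after rearrangement,

$$\begin{aligned} \tilde{P}_S &= p_S \sum_{l,k} \tilde{w}_l^* \tilde{w}_k E[\exp(-j(\delta_l - \delta_k))] \\ &= p_S \sum_{l \neq k} \tilde{w}_l^* \tilde{w}_k E[\exp(-j(\delta_l - \delta_k))] + p_S \sum_{l = k} \tilde{w}_l^* \tilde{w}_k \end{aligned} \tag{2.7.58}$$

Using the expansion

$$\exp(z) = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \tag{2.7.59}$$

the first term on the RHS of (2.7.58) becomes

$$p_S \sum_{l \neq k} \tilde{w}_l^* \tilde{w}_k E[\exp(-j(\delta_l - \delta_k))] = p_S \sum_{l \neq k} \tilde{w}_l^* \tilde{w}_k E\left[1 + j(\delta_k - \delta_l) - \frac{(\delta_k - \delta_l)^2}{2!} - \frac{j(\delta_k - \delta_l)^3}{3!} + \cdots\right] \tag{2.7.60}$$

Assuming that the contribution of the higher-order terms is negligibly small, using (2.7.47)
and (2.7.48), (2.7.60) results in

$$\begin{aligned} p_S \sum_{l \neq k} \tilde{w}_l^* \tilde{w}_k E\left[1 + j(\delta_k - \delta_l) - \frac{(\delta_k - \delta_l)^2}{2}\right] &= p_S \sum_{l \neq k} \tilde{w}_l^* \tilde{w}_k (1 - \sigma^2) \\ &= p_S (1 - \sigma^2) \sum_{l,k} \tilde{w}_l^* \tilde{w}_k - p_S (1 - \sigma^2) \sum_{l = k} \tilde{w}_l^* \tilde{w}_k \\ &= p_S (1 - \sigma^2) \tilde{w}^H \mathbf{1} \mathbf{1}^T \tilde{w} - p_S (1 - \sigma^2) \hat{\beta} \end{aligned} \tag{2.7.61}$$

Noting that the second term on the RHS of (2.7.58) is $p_S \hat{\beta}$, and the fact that $\tilde{w}^H \mathbf{1} = 1$, (2.7.58)
and (2.7.61) yield

$$\tilde{P}_S = p_S - p_S \sigma^2 (1 - \hat{\beta}) \tag{2.7.62}$$

Note that in the absence of directional interferences $\hat{\beta} = 1/L$, and (2.7.62) becomes

$$\tilde{P}_S = p_S - p_S \frac{L - 1}{L} \sigma^2 \tag{2.7.63}$$

Thus, the output signal power of OPPE is suppressed. The suppression of the output
signal power is proportional to the input signal power and random error variance. In the
presence of directional interference $\hat{\beta}$ increases and thus the reduction in the signal power
is less than otherwise. In other words, signal suppression is maximum in the absence of
directional interference, and is given by the second term on the RHS of (2.7.63).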
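
The white-noise case of (2.7.63) can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes Gaussian phase errors (the derivation only requires zero mean, equal variance, and no correlation), uniform weights 1/L so that $\hat{\beta} = 1/L$, and hypothetical function names. It compares the second-order prediction with a Monte Carlo average of the output signal power.

```python
import cmath
import random


def predicted_signal_power(p_s, sigma2, beta):
    """Second-order prediction (2.7.62): p_S - p_S * sigma^2 * (1 - beta)."""
    return p_s - p_s * sigma2 * (1.0 - beta)


def simulated_signal_power(p_s, num_elements, sigma, trials, seed=0):
    """Monte Carlo output signal power for white noise only (weights 1/L)."""
    rng = random.Random(seed)
    weight = 1.0 / num_elements
    total = 0.0
    for _ in range(trials):
        # After steering, the l-th signal component is exp(-j * delta_l),
        # where delta_l is the (assumed Gaussian) phase error.
        out = sum(weight * cmath.exp(-1j * rng.gauss(0.0, sigma))
                  for _ in range(num_elements))
        total += p_s * abs(out) ** 2
    return total / trials


L, sigma = 10, 0.1   # hypothetical array size and RMS phase error (radians)
print("predicted:", predicted_signal_power(1.0, sigma ** 2, 1.0 / L))
print("simulated:", simulated_signal_power(1.0, L, sigma, trials=5000))
```

For small error variance the two values should agree closely; the gap grows as the neglected higher-order terms of (2.7.60) become significant.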

2.7.3.3 Residual Interference Power 

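Rewrite (2.7.56) in the following form:

$$\tilde{P}_I = \sum_{l,k} \tilde{w}_l^* E[\Phi_{ll}^* R_{I\,lk} \Phi_{kk}] \tilde{w}_k \tag{2.7.64}$$

Using (2.6.32), (2.7.49), and (2.7.50) in (2.7.64),

$$\begin{aligned} \tilde{P}_I &= \sum_{l,k} \tilde{w}_l^* \Phi_{0\,ll}^* R_{I\,lk} \Phi_{0\,kk} \tilde{w}_k E[\exp(j(\delta_k - \delta_l))] \\ &= \sum_{l = k} \tilde{w}_l^* \Phi_{0\,ll}^* R_{I\,lk} \Phi_{0\,kk} \tilde{w}_k E[\exp(j(\delta_k - \delta_l))] \\ &\quad + \sum_{l \neq k} \tilde{w}_l^* \Phi_{0\,ll}^* R_{I\,lk} \Phi_{0\,kk} \tilde{w}_k E[\exp(j(\delta_k - \delta_l))] \end{aligned} \tag{2.7.65}$$

Noting that the diagonal entries of $R_I$ are the sum of all directional interference powers
$p_I$, the first term on the RHS of (2.7.65) reduces to $p_I \hat{\beta}$. Following steps (2.7.59) to (2.7.61),
the second term on the RHS of (2.7.65) becomes $(1 - \sigma^2)[\tilde{w}^H \Phi_0^H R_I \Phi_0 \tilde{w} - p_I \hat{\beta}]$. Thus,

$$\tilde{P}_I = (1 - \sigma^2) \tilde{w}^H \Phi_0^H R_I \Phi_0 \tilde{w} + \sigma^2 p_I \hat{\beta} \tag{2.7.66}$$

Substituting for $\tilde{w}$ from (2.7.46) in (2.7.66), it follows that

$$\tilde{P}_I = \hat{P}_I + \sigma^2 (\hat{\beta} p_I - \hat{P}_I) \tag{2.7.67}$$

where $p_I$ is the total power of all directional interferences at the input of the processor and $\hat{P}_I$
is the residual interference power of the optimal processor given by (2.4.18).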


2.7.3.4 Array Gain 

In this section, the effect of random phase errors on the array gain of OPPE is examined.
Let $\mathrm{SNR}_0$ be the output SNR of OPPE. Thus,

$$\mathrm{SNR}_0 = \frac{\tilde{P}_S}{\tilde{P}_N} \tag{2.7.68}$$

where

$$\tilde{P}_N = \tilde{P}_I + \tilde{P}_n \tag{2.7.69}$$

is the total mean output noise power of OPPE.

Since the uncorrelated mean output noise power is not affected by the random phase
errors, it follows from (2.7.54) that

$$\tilde{P}_n = \hat{P}_n = \sigma_n^2 \hat{\beta} \tag{2.7.70}$$

Substituting from (2.7.70) and (2.7.67) in (2.7.69), using (2.4.12) and (2.4.35), after
manipulation,

$$\tilde{P}_N = \hat{P}_N \left[1 + \sigma^2 (\hat{\beta} \hat{G} - 1)\right] \tag{2.7.71}$$

where $\hat{G}$ is the array gain of the optimal processor.
From (2.7.62), (2.7.68), and (2.7.71) it follows that

$$\mathrm{SNR}_0 = \frac{p_S}{\hat{P}_N} \cdot \frac{1 - \sigma^2 (1 - \hat{\beta})}{1 + \sigma^2 (\hat{\beta} \hat{G} - 1)} \tag{2.7.72}$$

If $\tilde{G}$ denotes the array gain of OPPE, then it follows from (2.7.72), (2.4.34), and (2.4.35) that

$$\tilde{G} = \hat{G} \, \frac{1 + \sigma^2 (\hat{\beta} - 1)}{1 + \sigma^2 (\hat{G} \hat{\beta} - 1)} \tag{2.7.73}$$

TABLE 2.4
Comparison of Steering Vector Errors and Phase Shifter Errors*

Type of Error                          Phase-Shifter Error                                                            Steering-Vector Error
Normalized output signal power         $1 - \sigma^2 (1 - \hat{\beta})$                                               $1 + \sigma_s^2 \hat{\beta}$
Normalized total output noise power    $1 - \sigma^2 (1 - \hat{\beta} \hat{G})$                                       $1 + \sigma_s^2 \kappa$
Normalized array gain                  $\dfrac{1 - \sigma^2 (1 - \hat{\beta})}{1 - \sigma^2 (1 - \hat{\beta} \hat{G})}$   $\dfrac{1 + \sigma_s^2 \hat{\beta}}{1 + \sigma_s^2 \kappa}$

* $\hat{\beta}$: ratio of the uncorrelated noise at the output to the input of the
optimal processor; $\hat{G}$: array gain of the optimal processor; $\sigma^2$: variance
of the additive random phase shifter errors; $\sigma_s^2$: variance of the additive
random steering vector errors; $\kappa$: scalar parameter defined by (2.7.20).


Let

$$\tilde{G}_1 = \tilde{G}\big|_{\sigma = \sigma_1} \tag{2.7.74}$$

and

$$\tilde{G}_2 = \tilde{G}\big|_{\sigma = \sigma_2} \tag{2.7.75}$$

A simple algebraic manipulation using (2.7.73) to (2.7.75) shows that for $\sigma_2 > \sigma_1$,

$$\tilde{G}_2 < \tilde{G}_1 \tag{2.7.76}$$

Thus, the array gain of the optimal processor with random phase errors is a monotonically
decreasing function of the error variance.
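
The monotonic behavior can be illustrated by evaluating (2.7.73) directly. The sketch below is only an illustration: the function name and the operating point (an error-free gain of 100 with $\hat{\beta} = 0.15$) are hypothetical values chosen for the example, not results from the text.

```python
def oppe_array_gain(gain_opt, beta, sigma2):
    """Array gain of the OPPE from (2.7.73).

    gain_opt: array gain of the error-free optimal processor
    beta:     output-to-input uncorrelated noise ratio of the optimal processor
    sigma2:   variance of the random phase errors (radians squared)
    """
    numerator = 1.0 + sigma2 * (beta - 1.0)
    denominator = 1.0 + sigma2 * (gain_opt * beta - 1.0)
    return gain_opt * numerator / denominator


# Hypothetical operating point: error-free gain of 100 (20 dB), beta = 0.15.
variances = [0.0, 0.01, 0.05, 0.1]
gains = [oppe_array_gain(100.0, 0.15, s2) for s2 in variances]
for s2, g in zip(variances, gains):
    print(f"sigma^2 = {s2:4.2f}  array gain = {g:7.3f}")
```

With zero error variance the function returns the error-free gain, and each larger variance gives a smaller gain, in line with (2.7.76).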

2.7.3.5 Comparison with SVE 

Now, a comparison between the effect of random phase shifter errors and the effect
of random SVE on optimal processor performance is made. SVE is discussed in Section 2.7.2.

Table 2.4 compares the results. For purposes of the comparison, the results in both cases are
normalized with the corresponding error-free values and thus are referred to as normalized.
The mean output signal power decreases with the increase in variance of phase shifter
error if $\hat{\beta} < 1$, whereas in the case of SVE it is a monotonically increasing function of the
variance of the errors. Note that for white noise only, $\hat{\beta} = 1/L$. The total mean output noise
power is a monotonically increasing function of SVE variance, whereas it decreases with
the increase in variance of the phase-shifter error if $\hat{\beta}\hat{G} < 1$. The array gains in both
cases are monotonically decreasing functions of the variance of random errors.


2.7.4 Phase Quantization Errors 

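In this section, a special case of random error, namely the phase quantization error, which
arises in digital phase shifters, is considered. In a p-bit phase shifter, the minimum value
by which the phase can be changed is $2\pi/2^p$. Thus, it is assumed that the error that exists in
a p-bit digital phase shifter is uniformly distributed between $-\pi/2^p$ and $\pi/2^p$.

For a uniformly distributed random variable x in the interval (-C, C), it can easily be
verified that

$$E[x^2] = \frac{C^2}{3} \tag{2.7.77}$$

Substituting $C = \pi/2^p$ in (2.7.77), the variance $\sigma_p^2$ of the error in a p-bit phase shifter
is given by

$$\sigma_p^2 = \frac{\pi^2}{3 \cdot 2^{2p}} \tag{2.7.78}$$

Substituting $\sigma_p$ for $\sigma$ in the expressions for the mean output signal power (2.7.62), mean output
noise power (2.7.71), output SNR (2.7.72), and the array gain (2.7.73), the following expressions for these quantities
are obtained as a function of the variance of the phase quantization error:

$$\tilde{P}_S = p_S \left[1 - \frac{\pi^2}{3 \cdot 2^{2p}} (1 - \hat{\beta})\right] \tag{2.7.79}$$

$$\tilde{P}_N = \hat{P}_N \left[1 + \frac{\pi^2}{3 \cdot 2^{2p}} (\hat{\beta} \hat{G} - 1)\right] \tag{2.7.80}$$

$$\mathrm{SNR}_0 = \frac{p_S}{\hat{P}_N} \cdot \frac{1 - \dfrac{\pi^2}{3 \cdot 2^{2p}} (1 - \hat{\beta})}{1 + \dfrac{\pi^2}{3 \cdot 2^{2p}} (\hat{\beta} \hat{G} - 1)} \tag{2.7.81}$$

$$\tilde{G} = \hat{G} \, \frac{1 - \dfrac{\pi^2}{3 \cdot 2^{2p}} (1 - \hat{\beta})}{1 + \dfrac{\pi^2}{3 \cdot 2^{2p}} (\hat{\beta} \hat{G} - 1)} \tag{2.7.82}$$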
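
To give a feel for the numbers, the quantization variance (2.7.78) and the resulting normalized signal power can be tabulated against the number of bits. This is a minimal sketch with hypothetical function names; the value $\hat{\beta} = 0.1$ (white noise only, ten elements) is an assumption for the example, and the signal-power expression is the small-variance result (2.7.62), so it is only meaningful when the variance is small.

```python
import math


def quantization_variance(bits):
    """Variance (2.7.78) of a phase error uniform on (-pi/2^p, pi/2^p)."""
    return math.pi ** 2 / (3.0 * 2 ** (2 * bits))


def normalized_signal_power(bits, beta):
    """Output signal power relative to the error-free value, using (2.7.62)."""
    return 1.0 - quantization_variance(bits) * (1.0 - beta)


for bits in range(3, 8):
    loss_db = -10.0 * math.log10(normalized_signal_power(bits, beta=0.1))
    print(f"{bits}-bit: variance = {quantization_variance(bits):.5f} rad^2, "
          f"signal loss = {loss_db:.4f} dB")
```

Each additional bit reduces the error variance by a factor of four, so the predicted loss falls off quickly with the word length of the phase shifter.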


2.7.5 Other Errors 

Uncertainty about the position of an array element causes degradation in the array
performance in general [Gof87, Kea80, She87, Ram80, Gil55], and particularly when the array
beam pattern is determined by constrained beamforming. As discussed previously,
element position uncertainty causes SVE, which in turn leads to a lower array gain. The effect
of position uncertainty on the beam pattern is to create a background beam pattern similar
to that of a single element, in addition to the normal pattern of the array [Gil55]. A general
discussion on the effect of various errors on the array pattern is provided in [Ste76].

A calibration process is normally used to determine the position of an antenna element
in an array. It requires auxiliary sources in known locations [Dor80]. A procedure that
does not require the location of these sources is described in [Roc87, Roc87a].

Element failure tends to cause an increase in sidelobe levels, and the weights estimated
for the full array no longer remain optimal [She87]. This requires recalculation of the
optimal weights with the known failed elements taken into account [She87, Ram80].

The effect of perturbation in the medium, which causes the wave front to deviate from
the plane wave propagation assumption, and related topics are found in [Vur79, Hin80,
Ste82]. The effect of a finite number of samples used in weight estimation is considered
in [Bor80, Ber86, Rag92], and how bandwidth affects narrowband beamformer performance
is discussed in [God86a, May79]. Effects of amplitude and phase errors on a mobile
satellite communication system using a spherical array employing digital beamforming
have also been studied [Chu92].


2.7.6 Robust Beamforming 

The perturbation of many array parameters from the ideal conditions under which the
theoretical system performance is predicted causes degradation in system performance by
reducing the array gain and altering the beam pattern. Various schemes have been
proposed to overcome these problems and to enhance the performance of array systems operating
under nonideal conditions [God87, Cox87, Eva82, Kim92, You93, Er85, Er93, Er93a, Er94,
Tak86]. Many of these schemes impose various kinds of constraints on the beam pattern
to alleviate the problem caused by parameter perturbation. A survey of robust signal
processing techniques in general is conducted in [Kas85]. It contains an excellent reference
list and discusses various issues concerning robustness.

Notation and Abbreviations

E[.] Expectation operator

1 Vector of ones

(.)^H Hermitian transposition of vector or matrix (.)

(.)^T Transposition of vector or matrix (.)

A Matrix of steering vectors 

B Matrix prefilter 

c Speed of propagation 

CIB conventional interference beamformer 

d element spacing 

ESP element space processor 

f 0 carrier frequency 

G array gain of optimal beamformer
G s

G s 

6 

IIB 

L 

MSE 

MMSE 

M 

m k(t) 

m s(t) 

m l(t) 

NAMI 

n i(t) 

n(t) 

OIB 

OPPE 

PIC 

Pk 

Pi 

Ps 

Pn 

P(w) 

P 

P- 

P(w) 

P(w) 

Ps 

% 

Ps 

Ps 

Ps 

p s 

P s (w) 

P s (w) 

Pi 

$ 

Pi 

Pi(w) 


array gain of optimal beamformer with WVE 

array gain of NAMI beamformer with SVE 

array gain of SPNMI processor with SVE 

array gain of OPPE 

improved interference beamformer 

number of elements in array 

mean square error 

minimum mean square error 

number of directional sources 

complex modulating function of kth source 

complex modulating function of signal source 

complex modulating function of interference source 

noise alone matrix inverse 

random noise on lth antenna

signal vector due to random noise 

orthogonal interference beamformer 

optimal processor with phase errors 

postbeamformer interference canceler 

power of kth source 

power of interference source 

power of signal source 

total noise at input 

mean output power for given w 

mean output power of optimal beamformer 

mean output power of optimal beamformer when known look direction is 
in error 

mean output power of optimal beam-space processor 
mean output power of optimal PIC processor 
mean output signal power 

mean output signal power of optimal beamformer 
mean output signal power of optimal beamformer in presence of WVE 
mean output signal power of NAMI processor in presence of SVE 
mean output signal power of SPNMI processor in presence of SVE 
mean output signal power of OPPE 

mean output signal power of beam-space processor for given w 
mean output signal power of optimal PIC processor with weight w 
mean output interference power 

mean output interference power of optimal beamformer 
mean output interference power of OPPE 

mean output interference power of optimal PIC processor with weight w 
mean output uncorrelated noise power
Pn

V 

Pn 

Pn(w) 

Pn 

Pn 

Pn 

Pn 

Pn 

^n 

Pn( w ) 

Pn(w) 

Po 

Q 

q(t) 

q(t) 

r(t) 

r i 

R 

Pn 

Rn 

R I 

RMS 

R s 

R x 

Rqq 

R 

R 

SPNMI 

SNR 

SNR(w) 

SNR(w) 

SNR 0 

SVE 

s(t) 

s(t-T) 

S 

S k 

So 

S 0 


mean output uncorrelated noise power of optimal beamformer 
mean output uncorrelated noise power of OPPE 

mean output uncorrelated noise power of optimal PIC processor with 
weight w 

mean output noise power 

mean output noise power of optimal beamformer 
mean output noise power of optimal processor in presence of WVE 
mean output noise power of NAMI processor in presence of SVE 
mean output noise power of SPNMI processor in presence of SVE 
mean output noise power of OPPE 

mean output noise power of beam-space processor for given w 

mean output noise power of optimal PIC processor with weight w 

mean power of main beam 

matrix of eigenvectors 

outputs of M - 1 auxiliary beams 

output of interference beam 

reference signal 

position vector of lth antenna

array correlation matrix 

noise-only array correlation matrix 

random noise-only array correlation matrix 

interference-only array correlation matrix 

root mean square 

signal-only array correlation matrix 
array correlation matrix used in tamed array 
correlation matrix of auxiliary beams 
array correlation matrix after steering delays 

actual array correlation matrix when known look direction is in error 
signal-plus-noise matrix inverse 
signal-to-noise ratio 

signal-to-noise ratio of optimal beam-space processor 

signal-to-noise ratio of optimal PIC processor with weight w 

SNR of OPPE 

steering vector error 

signal induced on reference element

signal delayed by T 

source correlation matrix 

steering vector associated with kth source 

steering vector associated with known look direction 

steering vector associated with actual look direction when known look direction
is in error
s

Si 

S(4>,0) 

T 

Tr[] 

U 

U 0 

U, 

V 

v (<t>k, 6k) 
WVE 

W 

w c 

w c 

w 1 

w 

w 

w 

w 

w 

w 

w 

W MSE 

Xl(t) 

x(t) 

x'(t) 

x s (t) 

Xl(t) 

y(t) 

ys(t) 

yi(t) 

y n (t) 

y(4>,e) 

z 

Z 

«i 

«1 


steering vector associated with known look direction in presence of SVE 

steering vector associated with interference 

steering vector associated with direction $(\phi, \theta)$

delay time 

Trace of [•] 

weight vector of interference beam of PIC 

weight vector of interference beam of PIC with OIB 

eigenvector associated with lth eigenvalue

main beam weight vector 

unit vector in direction 

weight vector error 

weight of optimal PIC 

weight of optimal PIC using CIB 

weight of optimal PIC using IIB 

weight of optimal PIC using OIB 

weight of optimal PIC using OIB when known look direction is in error 
weights of conventional beamformer 
weight on lth antenna
weight vector 

weights of optimal beamformer 

weights of optimal beamformer when known look direction is in error 

weights of optimal beamformer in presence of weight errors 

weights of NAMI processor in presence of SVE 

weights of SPNMI processor in presence of SVE 

weights of optimal beamformer using phase shifters to steer array 

optimal weights of beamformer using reference signal 

signal induced on lth antenna

element signal vector 

element signal vector after presteering delay 
element signal vector due to desired signal source 
element signal vector due to interference source 
array output 

signal component in array output 

interference component in array output 

random noise component in array output 

response of a beamformer in $(\phi, \theta)$

correlation between reference signal and x(t) 

correlation between outputs of auxiliary beams and main beam 

phase delays on lth channel to steer array in look direction, phase delays of
error-free phase shifter on lth channel

phase delays of actual phase shifter (including error) on lth channel
a

a w 

a s 

> 

a s 

«o 

a i 

P 

P 

Pp 

Po 

Yo 

5, 

e(t) 

e o 

4(w) 

I 

K 

¥(t) 

n(t) 

bo 

p 

r 

r s 

A 


<?s 

a 2 

°p 

(<Mk) 

(4»c>0o) 

(9k) 

9i 

9s 

h(<Mk) 

x i(0k) 

$0 

$ 


SNR of optimal beamformer 

output SNR of optimal beamformer with WVE 

output SNR of NAMI processor with SVE 

output SNR of SPNMI processor with SVE 

control variable used in tamed arrays 

SNR at output of interference beam of PIC processor 

ratio of uncorrelated noise power at output of optimal beamformer to input
uncorrelated noise power

normalized dot product of $S_0$ and $S_I$
phase of parameter $\beta$
Euclidean norm of $U_0$

normalized power response of interference beam in interference direction
phase error in lth phase shifter
error signal 

ratio of uncorrelated noise to interference power at input of beamformer 
MSE for given w 
minimum MSE 

scalar parameter defined by (2.7.20) 
output of main beam 
output of interference beam 
scalar constant 

scalar parameter function of array geometry, $\theta_S$ and $\theta_I$

vector of random errors in weights 

vector of random errors in steering vectors 

diagonal matrix of eigenvalues 

lth eigenvalue of array correlation matrix

power of random noise induced on element 

variance of weight vector errors 

variance of steering vector errors 

variance of phase shifter errors 

variance of phase error in p-bit phase shifter 

direction of kth source using three-dimensional notation 

look direction using three-dimensional notation 

direction of kth source using two-dimensional notation 

direction of interference source using two-dimensional notation 

direction of signal source using two-dimensional notation 

propagation delay on lth antenna from source in $(\phi_k, \theta_k)$

propagation delay on lth antenna from source in $(\theta_k)$

diagonal matrix of error-free phase delays 

diagonal matrix of phase delays (including errors)

The weights of an element-space antenna array processor that has a unity response in the
look direction and maximizes the output SNR in the absence of errors are given by

$$\hat{w} = \frac{R_N^{-1} S_0}{S_0^H R_N^{-1} S_0} \tag{3.1}$$

where $R_N$ is the array correlation matrix with no signal present, and is referred to as the
noise-only array correlation matrix, and $S_0$ is the steering vector associated with the look
direction. When the noise-only array correlation matrix is not available, the array correlation
matrix R is used to calculate the optimal weights. For this case the expression becomes

$$\hat{w} = \frac{R^{-1} S_0}{S_0^H R^{-1} S_0} \tag{3.2}$$

The weights of the processor that minimizes the mean square error (MSE) between the
array output and a reference signal are given by

$$\hat{w}_{MSE} = R^{-1} z \tag{3.3}$$

where z denotes the correlation between the reference signal and the array signals vector x(t).

In practice, neither the array correlation matrix nor the noise-alone matrix is available
to calculate optimal weights of the array. Thus, the weights are adjusted by some other
means using the available information derived from the array output, array signals, and
so on to make an estimate of the optimal weights. There are many such schemes and these
are normally referred to as adaptive algorithms. Some are described in this chapter, and their
characteristics such as the speed of adaptation, mean and variance of estimated weights,
and parameters affecting these characteristics are discussed. Both element space and beam
space processors are considered.


3.1 Sample Matrix Inversion Algorithm 

This algorithm estimates array weights by replacing the correlation matrix R by its estimate
[God97]. An unbiased estimate of R using N samples of the array signals may be obtained
using a simple averaging scheme as follows:

$$\hat{R}(N) = \frac{1}{N} \sum_{n=0}^{N-1} x(n) x^H(n) \tag{3.1.1}$$

where $\hat{R}(N)$ denotes the estimated array correlation matrix using N samples, and x(n)
denotes the array signal sample, also known as the array snapshot, at the nth instant of
time, with t replaced by nT and T denoting the sampling time. The sampling time T has
been omitted for ease of notation.

Let $\hat{R}(n)$ denote the estimate of the array correlation matrix and w(n) denote the array
weights at the nth instant of time. The estimate of R may be updated when new
samples arrive using

$$\hat{R}(n+1) = \frac{n \hat{R}(n) + x(n+1) x^H(n+1)}{n+1} \tag{3.1.2}$$

and a new estimate of the weights w(n + 1) at time instant n + 1 may be made.
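
The equivalence between the averaging estimate (3.1.1) and the recursive update (3.1.2) can be confirmed with a few lines of code. The following is a minimal sketch using only the Python standard library; the function names and the random test snapshots are illustrative assumptions, and the argument `n` of `recursive_update` is taken to be the number of snapshots already averaged.

```python
import random


def outer(x):
    """Outer product x x^H of a complex snapshot given as a list."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]


def batch_estimate(snapshots):
    """Averaging estimate of R over all snapshots, as in (3.1.1)."""
    n, size = len(snapshots), len(snapshots[0])
    total = [[0j] * size for _ in range(size)]
    for x in snapshots:
        for i, row in enumerate(outer(x)):
            for j, value in enumerate(row):
                total[i][j] += value
    return [[value / n for value in row] for row in total]


def recursive_update(r_prev, n, x_new):
    """Update (3.1.2): merge an n-snapshot estimate with one new snapshot."""
    xxh = outer(x_new)
    size = len(x_new)
    return [[(n * r_prev[i][j] + xxh[i][j]) / (n + 1) for j in range(size)]
            for i in range(size)]


rng = random.Random(0)
# Hypothetical data: 50 snapshots from a three-element array.
snapshots = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)]
             for _ in range(50)]

r_hat = batch_estimate(snapshots[:1])
for n, x in enumerate(snapshots[1:], start=1):
    r_hat = recursive_update(r_hat, n, x)

r_batch = batch_estimate(snapshots)
max_diff = max(abs(r_hat[i][j] - r_batch[i][j])
               for i in range(3) for j in range(3))
print(f"largest difference between recursive and batch estimates: {max_diff:.2e}")
```

The recursive form avoids storing past snapshots, which is the practical reason for using it when the weights are re-estimated as each new sample arrives.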

Let P(n) denote the output power at the nth instant of time given by

$$P(n) = w^H(n) x(n) x^H(n) w(n) \tag{3.1.3}$$

When N samples are used to estimate the array correlation matrix and the processor
has K degrees of freedom, the mean output power is given by [Van91]

$$E[P(n)] = \frac{N - K}{N} \hat{P} \tag{3.1.4}$$

where $\hat{P}$ denotes the mean output power of the processor with the optimal weights, that is,

$$\hat{P} = \hat{w}^H R \hat{w} \tag{3.1.5}$$

The factor (N - K)/N represents the loss due to the estimate of R and determines the
convergence behavior of the mean output power.
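
As a quick numerical illustration of the factor in (3.1.4), the ratio can be tabulated for a few sample counts. The function name, the guard against N <= K, and the choice K = 10 are assumptions made for this sketch.

```python
import math


def smi_power_ratio(num_samples, degrees_of_freedom):
    """Factor (N - K) / N from (3.1.4)."""
    if num_samples <= degrees_of_freedom:
        raise ValueError("need more samples than degrees of freedom")
    return (num_samples - degrees_of_freedom) / num_samples


K = 10  # hypothetical number of degrees of freedom
for N in (2 * K, 5 * K, 10 * K):
    ratio = smi_power_ratio(N, K)
    print(f"N = {N:3d}: E[P(n)] / P = {ratio:.2f} "
          f"({10 * math.log10(ratio):+.2f} dB)")
```

With N = 2K the factor is one half, and it approaches unity as N grows well beyond K.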

It should be noted that as the number of samples grows, the matrix update approaches
its true value and thus the estimated weights approach the optimal weights, that is, as
$n \to \infty$, $\hat{R}(n) \to R$ and $w(n) \to \hat{w}$ or $\hat{w}_{MSE}$, as the case may be.

The expression of optimal weights requires the inverse of the array correlation matrix,
and this process of estimating R and then its inverse may be combined to update the
inverse of the array correlation matrix from array signal samples using the Matrix Inversion
Lemma as follows:

$$\hat{R}^{-1}(n) = \hat{R}^{-1}(n-1) - \frac{\hat{R}^{-1}(n-1) x(n) x^H(n) \hat{R}^{-1}(n-1)}{1 + x^H(n) \hat{R}^{-1}(n-1) x(n)} \tag{3.1.6}$$

with

$$\hat{R}^{-1}(0) = \frac{1}{\varepsilon_0} I, \quad \varepsilon_0 > 0 \tag{3.1.7}$$

This scheme of estimating weights using the inverse update is referred to as the recursive
least squares (RLS) algorithm, which is further discussed in Section 3.9. More discussion
on the sample matrix inversion (SMI) algorithm is found in [Ree74, Van91, Hor79].
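
The inverse update (3.1.6) with the initialization (3.1.7) can be exercised on random snapshots. The sketch below is illustrative only: the helper names, array size, and value of $\varepsilon_0$ are assumptions. It relies on the fact that the recursion as written tracks the inverse of the accumulated matrix $\varepsilon_0 I + \sum_n x(n) x^H(n)$, which is built explicitly here for comparison.

```python
import random


def mat_vec(a, x):
    """Matrix-vector product for a square matrix stored as nested lists."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


def inverse_update(r_inv, x):
    """One step of (3.1.6) after adding the rank-one term x x^H."""
    size = len(x)
    u = mat_vec(r_inv, x)                                   # R^{-1}(n-1) x(n)
    v = [sum(x[k].conjugate() * r_inv[k][j] for k in range(size))
         for j in range(size)]                              # x^H(n) R^{-1}(n-1)
    denom = 1.0 + sum(x[k].conjugate() * u[k] for k in range(size))
    return [[r_inv[i][j] - u[i] * v[j] / denom for j in range(size)]
            for i in range(size)]


size, eps0 = 3, 0.01
rng = random.Random(1)
r = [[complex(eps0) if i == j else 0j for j in range(size)] for i in range(size)]
r_inv = [[complex(1.0 / eps0) if i == j else 0j for j in range(size)]
         for i in range(size)]                              # initialization (3.1.7)

for _ in range(20):
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(size)]
    for i in range(size):
        for j in range(size):
            r[i][j] += x[i] * x[j].conjugate()              # direct accumulation
    r_inv = inverse_update(r_inv, x)

product = [[sum(r_inv[i][k] * r[k][j] for k in range(size))
            for j in range(size)] for i in range(size)]
error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
            for i in range(size) for j in range(size))
print(f"largest deviation of R^-1 R from the identity: {error:.2e}")
```

Each update costs on the order of the square of the number of elements, whereas inverting the estimated matrix from scratch at every sample costs on the order of its cube.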

Application of SMI to estimate the weights of an array to operate in mobile communi¬ 
cation systems has been considered in many studies [Win94, Geb95, Lin95, Vau88, Has93, 
Pas96]. One of these studies [Lin95] considers beamforming for GSM signals using a 
variable reference signal as available during the symbol interval of the time-division 
multiple access (TDMA) system. Applications discussed include vehicular mobile com¬ 
munications [Vau88], reducing delay spread in indoor radio channels [Pas96], and mobile 
satellite communication systems [Geb95]. 


3.2 Unconstrained Least Mean Squares Algorithm 

Application of the least mean squares (LMS) algorithm to estimate the optimal weights of an
array is widespread and its study has been of considerable interest for some time. The
algorithm is referred to as the constrained LMS algorithm when the weights are subjected
to constraints at each iteration, whereas it is referred to as the unconstrained LMS
algorithm when weights are not constrained at each iteration. The latter is applicable mainly
when weights are updated using reference signals and no knowledge of the direction of
the signal is utilized, in contrast to the constrained case.

The algorithm updates the weights at each iteration by estimating the gradient of the
quadratic MSE surface, and then moving the weights in the negative direction of the
gradient by a small amount. The constant that determines this amount is referred to as
the step size. When this step size is small enough, the process leads these estimated weights
to the optimal weights. The convergence and transient behavior of these weights along
with their covariance characterize the LMS algorithm, and the way the step size and the
process of gradient estimation affect these parameters is of great practical importance.
These and other issues are discussed in detail in the following.

A real-time unconstrained LMS algorithm for determining the optimal weight $\hat{w}_{MSE}$ of the
system using the reference signal has been studied by many authors [Wid67, Gri69, Wid76,
Wid76a, Hor81, Ilt85, Cla87, Feu85, Gar86, Bol87, Fol88, Sol89, Jag90, Sol92, God97] and
is given by

$$w(n+1) = w(n) - \mu g(w(n)) \tag{3.2.1}$$

where w(n + 1) denotes the new weights computed at the (n + 1)th iteration, $\mu$ is a positive
scalar (gradient step size) that controls the convergence characteristic of the algorithm
(i.e., how fast and how close the estimated weights approach the optimal weights), and
g(w(n)) is an unbiased estimate of the MSE gradient. For a given w(n), the MSE is given
by (2.5.1), that is,

$$\xi(w(n)) = E\left[|r(n+1)|^2\right] + w^H(n) R w(n) - w^H(n) z - z^H w(n) \tag{3.2.2}$$

The MSE gradient at the nth iteration is obtained by differentiating (3.2.2) with respect
to w, yielding

$$\nabla_w \xi(w)\big|_{w = w(n)} = 2 R w(n) - 2 z \tag{3.2.3}$$

Note that at the (n + 1)th iteration, the array is operating with weights w(n) computed at
the previous iteration; however, the array signal vector is x(n + 1), the reference signal
sample is r(n + 1), and the array output is

$$y(w(n)) = w^H(n) x(n+1) \tag{3.2.4}$$


3.2.1 Gradient Estimate 

In its standard form, the LMS algorithm uses an estimate of the gradient by replacing R
and z by their noisy estimates available at the (n + 1)th iteration, leading to

$$g(w(n)) = 2 x(n+1) x^H(n+1) w(n) - 2 x(n+1) r^*(n+1) \tag{3.2.5}$$

Since the error $\varepsilon(w(n))$ between the array output and the reference signal is given by

$$\varepsilon(w(n)) = r(n+1) - w^H(n) x(n+1) \tag{3.2.6}$$

it follows from (3.2.5) that

$$g(w(n)) = -2 x(n+1) \varepsilon^*(w(n)) \tag{3.2.7}$$

Thus, the estimated gradient is a product of the error between the array output and the
reference signal and the array signals after the nth iteration. Taking the conditional
expectation on both sides of (3.2.5), it can easily be established that the mean of the gradient
estimate for a given w(n) becomes

$$\bar{g}(w(n)) = 2 R w(n) - 2 z \tag{3.2.8}$$

where $\bar{g}(w(n))$ denotes the mean of the gradient estimate for a given w(n). From (3.2.3)
and (3.2.8) it follows that the gradient estimate is unbiased.
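
The update (3.2.1) with the gradient estimate (3.2.7) reduces to $w(n+1) = w(n) + 2\mu x(n+1) \varepsilon^*(w(n))$, which is straightforward to simulate. The sketch below is a toy illustration rather than a reference implementation: the two-element array, the 20 degree arrival angle with half-wavelength spacing, the binary reference signal, the noise level, and the step size are all assumptions chosen for the example.

```python
import cmath
import math
import random


def lms_step(w, x, r, mu):
    """One update of (3.2.1) using the gradient estimate (3.2.7)."""
    y = sum(wi.conjugate() * xi for wi, xi in zip(w, x))   # array output w^H x
    err = r - y                                            # error (3.2.6)
    # w(n+1) = w(n) - mu * g(w(n)) = w(n) + 2 * mu * x * conj(err)
    w_new = [wi + 2.0 * mu * xi * err.conjugate() for wi, xi in zip(w, x)]
    return w_new, err


rng = random.Random(2)
# Hypothetical steering vector: two elements, half-wavelength spacing, 20 degrees.
steering = [1.0 + 0j, cmath.exp(1j * math.pi * math.sin(math.radians(20.0)))]
w = [0j, 0j]
mu = 0.01
squared_errors = []
for _ in range(2000):
    r = complex(rng.choice((-1.0, 1.0)), 0.0)              # reference sample
    x = [s * r + complex(rng.gauss(0, 0.1), rng.gauss(0, 0.1)) for s in steering]
    w, err = lms_step(w, x, r, mu)
    squared_errors.append(abs(err) ** 2)

print("mean squared error, first 50 iterations:", sum(squared_errors[:50]) / 50)
print("mean squared error, last 200 iterations:", sum(squared_errors[-200:]) / 200)
```

The squared error should start near the reference power and settle to a small residual set by the noise and the step size; a larger step size speeds convergence at the cost of a larger residual.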


3.2.2 Covariance of Gradient 

A particular characteristic of the gradient estimate, which is important in determining the
performance of the algorithm, is the covariance of the gradient estimate used. To obtain
results on the covariance of the gradient estimate given by (3.2.5), making an additional
Gaussian assumption about the sequence {x(k)} is necessary. Thus, it is assumed that {x(k)}
is an independent identically distributed (i.i.d.) complex Gaussian sequence.

The following result is useful for the analysis to obtain a fourth-order moment of
complex variables. The result, based on the Gaussian moment-factoring theorem, states
that [Ree62] when $x_1$, $x_2$, $x_3$, and $x_4$ are zero mean, complex jointly Gaussian random
variables, the following relationship holds:

$$E[x_1 x_2^* x_3 x_4^*] = E[x_1 x_2^*] E[x_3 x_4^*] + E[x_1 x_4^*] E[x_3 x_2^*] \tag{3.2.9}$$

Now consider the covariance of the gradient estimate given by (3.2.5). By definition, the
covariance of the gradient for a given w(n) is given by

$$\begin{aligned} V_g(w(n)) &= E\left[\{g(w(n)) - \bar{g}(w(n))\} \{g(w(n)) - \bar{g}(w(n))\}^H\right] \\ &= E[g(w(n)) g^H(w(n))] - E[g(w(n)) \bar{g}^H(w(n))] \\ &\quad - E[\bar{g}(w(n)) g^H(w(n))] + E[\bar{g}(w(n)) \bar{g}^H(w(n))] \\ &= E[g(w(n)) g^H(w(n))] - \bar{g}(w(n)) \bar{g}^H(w(n)) \end{aligned} \tag{3.2.10}$$

The second term on the RHS of (3.2.10) is obtained by taking the outer product of (3.2.8),
yielding

$$\bar{g}(w(n)) \bar{g}^H(w(n)) = 4 R w(n) w^H(n) R - 4 R w(n) z^H - 4 z w^H(n) R + 4 z z^H \tag{3.2.11}$$

To evaluate the first term on the RHS of (3.2.10), take the outer product of (3.2.5):

$$\begin{aligned} g(w(n)) g^H(w(n)) = 4 \{ & x(n+1) x^H(n+1) w(n) w^H(n) x(n+1) x^H(n+1) \\ & - x(n+1) x^H(n+1) w(n) r(n+1) x^H(n+1) \\ & - x(n+1) r^*(n+1) w^H(n) x(n+1) x^H(n+1) \\ & + x(n+1) r^*(n+1) r(n+1) x^H(n+1) \} \end{aligned} \tag{3.2.12}$$

Taking the conditional expectation on both sides one obtains, for a given w(n),

$$\begin{aligned} E[g(w(n)) g^H(w(n))] = {} & 4 E[x(n+1) x^H(n+1) w(n) w^H(n) x(n+1) x^H(n+1)] \\ & - 4 E[x(n+1) x^H(n+1) w(n) r(n+1) x^H(n+1)] \\ & - 4 E[x(n+1) r^*(n+1) w^H(n) x(n+1) x^H(n+1)] \\ & + 4 E[x(n+1) r^*(n+1) r(n+1) x^H(n+1)] \end{aligned} \tag{3.2.13}$$

Consider the fourth term on the RHS of (3.2.13), and define a matrix:

$$A = E[x(n+1)r^*(n+1)r(n+1)x^H(n+1)] \tag{3.2.14}$$

It follows from (3.2.14) and (3.2.9) that

$$\begin{aligned}
A_{ij} &= E[x_i(n+1)r^*(n+1)r(n+1)x_j^*(n+1)] \\
&= E[x_i(n+1)r^*(n+1)]\,E[r(n+1)x_j^*(n+1)] \\
&\quad + E[x_i(n+1)x_j^*(n+1)]\,E[r^*(n+1)r(n+1)]
\end{aligned} \tag{3.2.15}$$

This along with (2.5.2) implies that

$$A = zz^H + Rp_r \tag{3.2.16}$$

where

$$p_r = E[r^*(n)r(n)] \tag{3.2.17}$$

is the mean power of the reference signal.

Similarly evaluating the other terms on the RHS of (3.2.13),

$$\begin{aligned}
E[g(w(n))g^H(w(n))] ={}& 4w^H(n)Rw(n)R + 4Rw(n)w^H(n)R \\
& - 4Rw(n)z^H - 4Rz^Hw(n) \\
& - 4zw^H(n)R - 4Rw^H(n)z \\
& + 4zz^H + 4Rp_r
\end{aligned} \tag{3.2.18}$$

Subtracting (3.2.11) from (3.2.18) and using (3.2.2),

$$\begin{aligned}
V_g(w(n)) &= 4R\{w^H(n)Rw(n) - z^Hw(n) - w^H(n)z + p_r\} \\
&= 4R\,\xi(w(n))
\end{aligned} \tag{3.2.19}$$

where $\xi(w(n))$ is the MSE given by (3.2.2).


3.2.3 Convergence of Weight Vector 

In this section, it is shown that the mean value of the weights estimated by (3.2.1) using 
the gradient estimate given by (3.2.5) approaches the optimal weights in the limit as the 
number of iterations grows large. For this discussion, it is assumed that the successive 
array signal samples are uncorrelated. This is usually achieved by having a sufficiently 
long iteration cycle of the algorithm. Substituting from (3.2.5) in (3.2.1), it follows that 

$$w(n+1) = w(n) - 2\mu x(n+1)x^H(n+1)w(n) + 2\mu x(n+1)r^*(n+1) \tag{3.2.20}$$

Equation (3.2.20) shows that w(n) is only a function of x(0), x(1), ..., x(n). This along
with the assumption that the successive array samples are uncorrelated implies that w(n)
and x(n + 1) are uncorrelated. Hence, taking the expected value on both sides of (3.2.20),

$$\begin{aligned}
\bar{w}(n+1) &= \bar{w}(n) - 2\mu E[x(n+1)x^H(n+1)]\bar{w}(n) + 2\mu E[x(n+1)r^*(n+1)] \\
&= \bar{w}(n) - 2\mu R\bar{w}(n) + 2\mu z \\
&= [I - 2\mu R]\bar{w}(n) + 2\mu z
\end{aligned} \tag{3.2.21}$$

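where

$$\bar{w}(n) = E[w(n)] \tag{3.2.22}$$

Define a mean error vector $\bar{v}(n)$ as

$$\bar{v}(n) = \bar{w}(n) - \hat{w}_{MSE} \tag{3.2.23}$$

where $\hat{w}_{MSE}$ is the optimal weight given by (3.3), that is,

$$\hat{w}_{MSE} = R^{-1}z \tag{3.2.24}$$

It follows from (3.2.23) that $\bar{w}(n)$ is given by

$$\bar{w}(n) = \bar{v}(n) + \hat{w}_{MSE} \tag{3.2.25}$$

Substituting for $\bar{w}(n)$ in (3.2.21),

$$\bar{v}(n+1) = [I - 2\mu R]\bar{v}(n) - 2\mu R\hat{w}_{MSE} + 2\mu z \tag{3.2.26}$$

Noting from (3.2.24) that

$$z = R\hat{w}_{MSE} \tag{3.2.27}$$

it follows from (3.2.26) that

$$\begin{aligned}
\bar{v}(n+1) &= [I - 2\mu R]\bar{v}(n) \\
&= [I - 2\mu R]^{n+1}\bar{v}(0)
\end{aligned} \tag{3.2.28}$$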
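The mean recursion (3.2.21) can be iterated directly to see the mean weights approach $R^{-1}z$. The following is a minimal sketch, not part of the original derivation: the 2 × 2 matrix `R`, the vector `z`, and the step size `mu` are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical 2-element example: R is Hermitian positive definite, z is arbitrary.
R = np.array([[2.0, 0.5 + 0.5j],
              [0.5 - 0.5j, 1.0]])
z = np.array([1.0 + 0.5j, -0.2j])
mu = 0.1                               # assumed step size, below 1/Tr(R) = 1/3

w_opt = np.linalg.solve(R, z)          # optimal weights, (3.2.24)
w_bar = np.zeros(2, dtype=complex)     # mean weights, started at zero
I = np.eye(2)
for n in range(300):
    w_bar = (I - 2 * mu * R) @ w_bar + 2 * mu * z   # mean recursion (3.2.21)

final_error = np.linalg.norm(w_bar - w_opt)         # norm of the mean error vector
print(final_error)
```

With these values the mean error vector shrinks geometrically at every iteration, as (3.2.28) predicts.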


The behavior of the RHS of (3.2.28) can be explained better by converting it into diagonal
form, which can be done by using the eigenvalue decomposition of R given by (2.1.29).
In the following, (2.1.29) is rewritten:

$$R = Q\Lambda Q^H \tag{3.2.29}$$

where $\Lambda$ is a diagonal matrix of the eigenvalues of R and Q, given by (2.1.31), is a
matrix with columns being the eigenvectors of R.

Substituting for R in (3.2.28),

$$\bar{v}(n+1) = [I - 2\mu Q\Lambda Q^H]^{n+1}\bar{v}(0) \tag{3.2.30}$$

Equation (3.2.30) may be rewritten in the following form, using induction:

$$\bar{v}(n+1) = Q[I - 2\mu\Lambda]^{n+1}Q^H\bar{v}(0) \tag{3.2.31}$$

For n = 0, it follows from (3.2.30) using $QQ^H = I$ that

$$\begin{aligned}
\bar{v}(1) &= [I - 2\mu Q\Lambda Q^H]\bar{v}(0) \\
&= Q[I - 2\mu\Lambda]Q^H\bar{v}(0)
\end{aligned} \tag{3.2.32}$$

Thus, (3.2.31) holds for n = 0. For n = 1, it follows from (3.2.30) using $Q^HQ = I$ that

$$\begin{aligned}
\bar{v}(2) &= [I - 2\mu Q\Lambda Q^H]^2\bar{v}(0) \\
&= [I - 2\mu Q\Lambda Q^H - 2\mu Q\Lambda Q^H + 4\mu^2 Q\Lambda Q^HQ\Lambda Q^H]\bar{v}(0) \\
&= Q[I - 2\mu\Lambda - 2\mu\Lambda + 4\mu^2\Lambda^2]Q^H\bar{v}(0) \\
&= Q[I - 2\mu\Lambda]^2Q^H\bar{v}(0)
\end{aligned} \tag{3.2.33}$$

Thus, (3.2.31) holds for n = 1. If the result holds at some iteration n, that is,

$$\bar{v}(n) = Q[I - 2\mu\Lambda]^nQ^H\bar{v}(0) \tag{3.2.34}$$

then

$$\begin{aligned}
\bar{v}(n+1) &= [I - 2\mu Q\Lambda Q^H]^{n+1}\bar{v}(0) \\
&= [I - 2\mu Q\Lambda Q^H]^n[I - 2\mu Q\Lambda Q^H]\bar{v}(0) \\
&= Q[I - 2\mu\Lambda]^nQ^H[I - 2\mu Q\Lambda Q^H]\bar{v}(0) \\
&= \{Q[I - 2\mu\Lambda]^nQ^H - 2\mu Q[I - 2\mu\Lambda]^nQ^HQ\Lambda Q^H\}\bar{v}(0) \\
&= \{Q[I - 2\mu\Lambda]^nQ^H - 2\mu Q[I - 2\mu\Lambda]^n\Lambda Q^H\}\bar{v}(0) \\
&= \{Q[I - 2\mu\Lambda]^n[I - 2\mu\Lambda]Q^H\}\bar{v}(0) \\
&= Q[I - 2\mu\Lambda]^{n+1}Q^H\bar{v}(0)
\end{aligned} \tag{3.2.35}$$

and it holds for n + 1. Thus, by induction it is proved that (3.2.30) may be rewritten in the
form of (3.2.31). The quantity in the square bracket on the RHS of (3.2.31) is a diagonal
matrix with each entry $(1 - 2\mu\lambda_i)$, with $\lambda_i$, i = 1, ..., L being the L eigenvalues of R.
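The equivalence of (3.2.30) and (3.2.31) can also be checked numerically. The sketch below assumes a hypothetical 2 × 2 Hermitian matrix `R`, step size `mu`, and iteration index `n`; it compares the matrix power with its diagonalized form.

```python
import numpy as np

# Hypothetical Hermitian positive definite correlation matrix and step size.
R = np.array([[2.0, 0.5 + 0.5j],
              [0.5 - 0.5j, 1.0]])
mu = 0.1
n = 7                                   # arbitrary iteration index

lam, Q = np.linalg.eigh(R)              # R = Q diag(lam) Q^H, as in (3.2.29)
lhs = np.linalg.matrix_power(np.eye(2) - 2 * mu * R, n + 1)    # bracket of (3.2.30)
rhs = Q @ np.diag((1 - 2 * mu * lam) ** (n + 1)) @ Q.conj().T  # form of (3.2.31)
print(np.allclose(lhs, rhs))
```

Only the L diagonal entries $(1 - 2\mu\lambda_i)^{n+1}$ change with n, which is why the convergence can be studied one eigenvector at a time.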

For $0 < \mu < 1/\lambda_{\max}$, with $\lambda_{\max}$ denoting the maximum eigenvalue of R, the magnitude of
each diagonal element is less than 1, that is,

$$|1 - 2\mu\lambda_i| < 1 \quad \forall i \tag{3.2.36}$$

Hence, as the iteration number increases, each diagonal element of the matrix in the square
bracket diminishes, yielding

$$\lim_{n\to\infty}\bar{v}(n) = 0 \tag{3.2.37}$$

This along with (3.2.23) implies that

$$\lim_{n\to\infty}\bar{w}(n) = \hat{w}_{MSE} \tag{3.2.38}$$

Thus, for $\mu < 1/\lambda_{\max}$, the algorithm is stable and the mean value of the estimated weights
converges to the optimal weights. As the sum of all eigenvalues of R equals its trace, the
sum of its diagonal elements, the gradient step size $\mu$ can be selected in terms of measurable
quantities using $\mu < 1/\mathrm{Tr}(R)$, with Tr(R) denoting the trace of R. It should be noted that
each diagonal element of R is equal to the average power measured on the corresponding
element of the array. Thus, for an array of identical elements, the trace of R equals the
power measured on any one element times the number of elements in the array.
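The trace-based bound is more conservative than the eigenvalue bound, because Tr(R) is at least as large as $\lambda_{\max}$ for a positive semidefinite R. The sketch below illustrates this with a hypothetical matrix `R`; choosing half the trace bound as the working step size is an assumption made only for the example.

```python
import numpy as np

# Hypothetical array correlation matrix; its diagonal holds the element powers.
R = np.array([[2.0, 0.5 + 0.5j],
              [0.5 - 0.5j, 1.0]])

lam = np.linalg.eigvalsh(R)                 # eigenvalues of R (real, ascending)
mu_bound_trace = 1.0 / np.trace(R).real     # measurable bound: 1/Tr(R)
mu_bound_eig = 1.0 / lam.max()              # bound in terms of lambda_max

mu = 0.5 * mu_bound_trace                   # assumed safety margin
ratios = np.abs(1 - 2 * mu * lam)           # geometric ratios, see (3.2.36)
print(mu_bound_trace, mu_bound_eig, ratios)
```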


3.2.4 Convergence Speed 

The convergence speed of the algorithm refers to the speed by which the mean of the
estimated weights (ensemble average of many trials) approaches the optimal weights, and
is normally characterized by L trajectories along L eigenvectors of R. To obtain the convergence
time constant along an eigenvector of R, consider the initial mean error vector $\bar{v}(0)$
and express it as a linear combination of L eigenvectors of R, that is,

$$\bar{v}(0) = \sum_{i=1}^{L}\alpha_iU_i \tag{3.2.39}$$

where $\alpha_i$, i = 1, 2, ..., L are scalars and $U_i$, i = 1, 2, ..., L are eigenvectors corresponding
to L eigenvalues of R.

Substituting from (3.2.39) in (3.2.31) yields

$$\bar{v}(n+1) = Q[I - 2\mu\Lambda]^{n+1}Q^H\sum_{i=1}^{L}\alpha_iU_i \tag{3.2.40}$$

Since eigenvectors of R are orthogonal, (3.2.40) can be expressed as

$$\begin{aligned}
\bar{v}(n+1) &= \sum_{i=1}^{L}\alpha_iQ[I - 2\mu\Lambda]^{n+1}Q^HU_i \\
&= \sum_{i=1}^{L}\alpha_i[1 - 2\mu\lambda_i]^{n+1}U_i
\end{aligned} \tag{3.2.41}$$

The convergence of the mean weight vector to the optimal weight vector along the ith
eigenvector is therefore geometric, with geometric ratio $1 - 2\mu\lambda_i$. If an exponential envelope
of the time constant $\tau_i$ is fitted to the geometric sequence of (3.2.41), then

$$\tau_i = \frac{-1}{\ln(1 - 2\mu\lambda_i)} \tag{3.2.42}$$

where ln denotes the natural logarithm and the unit of time is assumed to be one iteration.
The negative sign in (3.2.42) appears due to the fact that the quantity in parentheses is
less than unity and the logarithm of that is a negative quantity.

Note that if

$$2\mu\lambda_i \ll 1 \tag{3.2.43}$$

the time constant of the ith trajectory may be approximated by

$$\tau_i \approx \frac{1}{2\mu\lambda_i} \tag{3.2.44}$$

Thus, these time constants are functions of the eigenvalues of the array correlation matrix:
the smallest one depends on $\lambda_{\max}$, which normally corresponds to the strongest source,
and the largest one is controlled by the smallest eigenvalue, which corresponds to the weakest
source or the background noise. Therefore, the larger the eigenvalue spread, the longer it
takes for the algorithm to converge. In terms of interference rejection capability, this means
canceling the strongest source first and the weakest last.
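The time constants (3.2.42) and their small-step approximation (3.2.44) are easy to tabulate. The sketch below uses hypothetical eigenvalues and a hypothetical step size; the function names are illustrative only.

```python
import math

def time_constant(mu, lam):
    """Exact time constant (3.2.42), in iterations; requires 0 < 2*mu*lam < 1."""
    return -1.0 / math.log(1.0 - 2.0 * mu * lam)

def time_constant_approx(mu, lam):
    """Approximation (3.2.44), valid when 2*mu*lam is much smaller than 1."""
    return 1.0 / (2.0 * mu * lam)

mu = 0.001                          # assumed step size
eigenvalues = [10.0, 1.0, 0.1]      # hypothetical: strong source ... background noise
for lam in eigenvalues:
    print(lam, time_constant(mu, lam), time_constant_approx(mu, lam))
```

The trajectory along the eigenvector with the largest eigenvalue settles in tens of iterations here, while the one associated with the smallest eigenvalue needs thousands, which illustrates the effect of the eigenvalue spread.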

The convergence speed of an algorithm is an important property. Its importance for
mobile communications is highlighted in [Nag94], which discusses how the LMS algorithm
does not perform as well as some other algorithms because of its slow convergence in
situations of fast-changing signal characteristics. The time available for an algorithm to
converge in mobile communication systems depends not only on the system design, which
dictates how long the user signal is present (such as the user slot duration in a TDMA
system), but also on the speed of the mobiles, which changes the rate at which a signal
fades. For example, a mobile on foot would cause the signal to fade at a rate of about
5 Hz, whereas the rate would be of the order of 50 Hz for a vehicle mobile, implying that
an algorithm needs to converge faster in a system used by vehicle mobiles than in
one used by handheld portables [Win87]. Some of these issues for the IS-54 system
are discussed in [Win94], where the convergence of the LMS and the SMI algorithms in
mobile communication situations is compared.

Even when the mean of the estimated weights converges to the optimal weights, the weights have
finite covariance, that is, their covariance matrix is not identical to a matrix with all
elements equal to zero. This causes the average of the MSE not to converge to the minimum
MSE (MMSE) and leads to excess MSE. Convergence of the weight covariance matrix and
the excess MSE is discussed in the following sections.


3.2.5 Weight Covariance Matrix 

The covariance matrix of the weights at the nth iteration is given by

$$\begin{aligned}
k_{ww}(n) &= E[(w(n) - \bar{w}(n))(w(n) - \bar{w}(n))^H] \\
&= E[w(n)w^H(n)] + \bar{w}(n)\bar{w}^H(n) \\
&\quad - E[w(n)\bar{w}^H(n)] - E[\bar{w}(n)w^H(n)] \\
&= R_{ww}(n) - \bar{w}(n)\bar{w}^H(n)
\end{aligned} \tag{3.2.45}$$

where expectation is unconditional and taken over w,

$$\bar{w}(n) = E[w(n)] \tag{3.2.46}$$

and

$$R_{ww}(n) = E[w(n)w^H(n)] \tag{3.2.47}$$

In this section, a recursive relationship for the weight covariance matrix is derived. The 
relationship is useful for understanding the transient behavior of the matrix. 

It follows from (3.2.45) that

$$k_{ww}(n+1) = R_{ww}(n+1) - \bar{w}(n+1)\bar{w}^H(n+1) \tag{3.2.48}$$

and from (3.2.47) that

$$R_{ww}(n+1) = E[w(n+1)w^H(n+1)] \tag{3.2.49}$$

Substituting from (3.2.1) in (3.2.49),

$$\begin{aligned}
R_{ww}(n+1) ={}& R_{ww}(n) + \mu^2E[g(w(n))g^H(w(n))] \\
& - \mu E[g(w(n))w^H(n)] - \mu E[w(n)g^H(w(n))]
\end{aligned} \tag{3.2.50}$$

Taking unconditional expectation on both sides of (3.2.10), and rearranging, it follows
that

$$E[g(w(n))g^H(w(n))] = E[V_g(w(n))] + E[\bar{g}(w(n))\bar{g}^H(w(n))] \tag{3.2.51}$$

where $\bar{g}(w(n))$ is the mean value of the gradient estimate for a given w(n). An expression
for $\bar{g}(w(n))$ is given by (3.2.8). From (3.2.8), taking the outer product of $\bar{g}(w(n))$,

$$\begin{aligned}
E[\bar{g}(w(n))\bar{g}^H(w(n))] &= E\big[\{2Rw(n) - 2z\}\{2Rw(n) - 2z\}^H\big] \\
&= 4\{RR_{ww}(n)R + zz^H - R\bar{w}(n)z^H - z\bar{w}^H(n)R\}
\end{aligned} \tag{3.2.52}$$

From (3.2.5),

$$E[w(n)g^H(w(n))] = 2R_{ww}(n)R - 2\bar{w}(n)z^H \tag{3.2.53}$$

and

$$E[g(w(n))w^H(n)] = 2RR_{ww}(n) - 2z\bar{w}^H(n) \tag{3.2.54}$$


From (3.2.50) to (3.2.54) it follows that

$$\begin{aligned}
R_{ww}(n+1) ={}& R_{ww}(n) + \mu^2E[V_g(w(n))] \\
& + 4\mu^2\{RR_{ww}(n)R + zz^H - R\bar{w}(n)z^H - z\bar{w}^H(n)R\} \\
& - 2\mu\{RR_{ww}(n) - z\bar{w}^H(n) + R_{ww}(n)R - \bar{w}(n)z^H\}
\end{aligned} \tag{3.2.55}$$

Evaluation of (3.2.48) requires the outer product of $\bar{w}(n+1)$. From (3.2.1) and (3.2.8),

$$\bar{w}(n+1) = \bar{w}(n) - 2\mu R\bar{w}(n) + 2\mu z \tag{3.2.56}$$

and thus

$$\begin{aligned}
\bar{w}(n+1)\bar{w}^H(n+1) ={}& \bar{w}(n)\bar{w}^H(n) \\
& + 4\mu^2\{R\bar{w}(n)\bar{w}^H(n)R + zz^H - R\bar{w}(n)z^H - z\bar{w}^H(n)R\} \\
& - 2\mu\{\bar{w}(n)\bar{w}^H(n)R + R\bar{w}(n)\bar{w}^H(n) - \bar{w}(n)z^H - z\bar{w}^H(n)\}
\end{aligned} \tag{3.2.57}$$

Subtracting (3.2.57) from (3.2.55) and using (3.2.48),

$$\begin{aligned}
k_{ww}(n+1) ={}& k_{ww}(n) + 4\mu^2Rk_{ww}(n)R - 2\mu\{Rk_{ww}(n) + k_{ww}(n)R\} \\
& + \mu^2E[V_g(w(n))]
\end{aligned} \tag{3.2.58}$$

Thus, at each iteration the weight covariance matrix depends on the mean value of the
gradient covariance used at the previous iteration. Equation (3.2.58) may be further simplified
by substituting for $V_g(w(n))$ from (3.2.19). Taking the expectation over w on both
sides of (3.2.19),

$$\begin{aligned}
E[V_g(w(n))] &= 4RE[w^H(n)Rw(n) - z^Hw(n) - w^H(n)z + p_r] \\
&= 4R\{E[w^H(n)Rw(n)] - z^H\bar{w}(n) - \bar{w}^H(n)z + p_r\}
\end{aligned} \tag{3.2.59}$$

Using

$$\begin{aligned}
E[w^H(n)Rw(n)] &= E\big[\mathrm{Tr}[Rw(n)w^H(n)]\big] \\
&= \mathrm{Tr}\big[E[Rw(n)w^H(n)]\big] \\
&= \mathrm{Tr}[RR_{ww}(n)] \\
&= \mathrm{Tr}[Rk_{ww}(n) + R\bar{w}(n)\bar{w}^H(n)] \\
&= \mathrm{Tr}[Rk_{ww}(n)] + \bar{w}^H(n)R\bar{w}(n)
\end{aligned} \tag{3.2.60}$$

(3.2.59) becomes

$$E[V_g(w(n))] = 4R\,\mathrm{Tr}[Rk_{ww}(n)] + 4R\,\xi(\bar{w}(n)) \tag{3.2.61}$$

where

$$\xi(\bar{w}(n)) = \bar{w}^H(n)R\bar{w}(n) - z^H\bar{w}(n) - \bar{w}^H(n)z + p_r \tag{3.2.62}$$

and Tr[.] denotes the trace of [.].

Substituting (3.2.61) in (3.2.58),

$$\begin{aligned}
k_{ww}(n+1) ={}& k_{ww}(n) + 4\mu^2Rk_{ww}(n)R + 4\mu^2R\,\mathrm{Tr}[Rk_{ww}(n)] \\
& - 2\mu\{Rk_{ww}(n) + k_{ww}(n)R\} + 4\mu^2R\,\xi(\bar{w}(n))
\end{aligned} \tag{3.2.63}$$

Thus, at the (n + 1)st iteration the weight covariance matrix is a function of $\xi(\bar{w}(n))$.


3.2.6 Transient Behavior of Weight Covariance Matrix 

In this section, the transient behavior of the weight covariance matrix is studied by
deriving an expression for $k_{ww}(n)$ and its limit as $n \to \infty$. Define

$$\Sigma(n) = Q^Hk_{ww}(n)Q \tag{3.2.64}$$

By pre- and postmultiplying by $Q^H$ and Q on both sides of (3.2.63), and using

$$QQ^H = I \tag{3.2.65}$$

it follows that

$$\begin{aligned}
Q^Hk_{ww}(n+1)Q ={}& Q^Hk_{ww}(n)Q + 4\mu^2Q^HRQQ^Hk_{ww}(n)QQ^HRQ \\
& + 4\mu^2Q^HRQ\,\mathrm{Tr}[Q^HRQQ^Hk_{ww}(n)Q] \\
& - 2\mu\{Q^HRQQ^Hk_{ww}(n)Q + Q^Hk_{ww}(n)QQ^HRQ\} \\
& + 4\mu^2Q^HRQ\,\xi(\bar{w}(n))
\end{aligned} \tag{3.2.66}$$

Using $Q^HRQ = \Lambda$ and (3.2.64) in (3.2.66), the following matrix difference equation is
derived:

$$\begin{aligned}
\Sigma(n+1) ={}& \Sigma(n) + 4\mu^2\Lambda^2\Sigma(n) + 4\mu^2\Lambda\,\mathrm{Tr}[\Lambda\Sigma(n)] \\
& - 4\mu\Lambda\Sigma(n) + 4\mu^2\Lambda\,\xi(\bar{w}(n))
\end{aligned} \tag{3.2.67}$$


Now it is shown by induction that $\Sigma(n)$, $n \ge 0$ is a diagonal matrix. Consider n = 0. Since
the initial weight vector is w(0), it follows from (3.2.45) to (3.2.47) and (3.2.64) that

$$\Sigma(0) = 0 \tag{3.2.68}$$

From (3.2.67) and (3.2.68),

$$\Sigma(1) = 4\mu^2\Lambda\,\xi(\bar{w}(0)) \tag{3.2.69}$$

As $\Lambda$ is a diagonal matrix, it follows from (3.2.69) that $\Sigma(1)$ is a diagonal matrix. Thus,
$\Sigma(n)$ is diagonal for n = 0 and 1. If $\Sigma(n)$ is diagonal for any n, then it follows from (3.2.67)
that it is diagonal for n + 1. Thus, $\Sigma(n)$, $n \ge 0$ is a diagonal matrix.

As Q is a unitary transformation, it follows that the diagonal elements of $\Sigma(n)$ are the
eigenvalues of $k_{ww}(n)$. Let these be denoted by $\eta_i(n)$, i = 1, ..., L. Defining

$$\lambda = [\lambda_1, \ldots, \lambda_L]^T \tag{3.2.70}$$

and

$$\eta(n) = [\eta_1(n), \ldots, \eta_L(n)]^T \tag{3.2.71}$$

to denote the eigenvalues of R and $\Sigma(n)$, respectively,

$$\mathrm{Tr}[\Lambda\Sigma(n)] = \lambda^T\eta(n) \tag{3.2.72}$$

Substituting (3.2.72) in (3.2.67), the vector difference equation for the eigenvalues of $\Sigma(n)$
is

$$\eta(n+1) = [I - 4\mu\Lambda + 4\mu^2\Lambda^2 + 4\mu^2\lambda\lambda^T]\eta(n) + 4\mu^2\xi(\bar{w}(n))\lambda \tag{3.2.73}$$

With

$$H = 4\mu\Lambda - 4\mu^2\Lambda^2 - 4\mu^2\lambda\lambda^T \tag{3.2.74}$$

equation (3.2.73) has the solution

$$\eta(n) = \{I - H\}^n\eta(0) + 4\mu^2\sum_{i=1}^{n}\{I - H\}^{i-1}\xi(\bar{w}(n-i))\lambda \tag{3.2.75}$$
Since Q diagonalizes $k_{ww}(n)$, it follows that

$$k_{ww}(n) = \sum_{l=1}^{L}\eta_l(n)U_lU_l^H \tag{3.2.76}$$

where $U_l$, l = 1, ..., L are the eigenvectors of R. Equation (3.2.76) describes the transient
behavior of the weight covariance matrix.

The next section shows that $\lim_{n\to\infty}\lambda^T\eta(n)$ exists under the conditions noted there. This,
along with the fact that $0 < \lambda_i < \infty\ \forall i$, implies that $\lim_{n\to\infty}\eta(n)$ exists. It then follows from
(3.2.73) and (3.2.74) that

$$\lim_{n\to\infty}\eta(n) = 4\mu^2\hat{\xi}H^{-1}\lambda = \frac{\mu\hat{\xi}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}\left[\frac{1}{1-\mu\lambda_1}, \ldots, \frac{1}{1-\mu\lambda_L}\right]^T \tag{3.2.77}$$

where $\hat{\xi}$ is the minimum MSE given by (2.5.6). This along with (3.2.76) implies that an
expression for the steady-state weight covariance matrix is given by

$$\lim_{n\to\infty}k_{ww}(n) = \frac{\mu\hat{\xi}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}\sum_{l=1}^{L}\frac{1}{1-\mu\lambda_l}U_lU_l^H \tag{3.2.78}$$
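The steady-state expression (3.2.77) can be compared with a direct iteration of the difference equation (3.2.73). The sketch below is only illustrative: the eigenvalues `lam`, the step size `mu`, the minimum MSE `xi_hat`, and the initial mean-error coordinates `alpha` are hypothetical, and the MSE at the mean weights is modeled as the minimum MSE plus the excess due to the decaying mean error of (3.2.41).

```python
import numpy as np

lam = np.array([2.0, 1.0, 0.5])       # hypothetical eigenvalues of R
mu = 0.05                             # assumed step size
xi_hat = 0.1                          # hypothetical minimum MSE
alpha = np.array([1.0, -0.5, 0.25])   # hypothetical coordinates of the initial mean error

# Matrix H of (3.2.74)
H = 4 * mu * np.diag(lam) - 4 * mu**2 * np.diag(lam**2) - 4 * mu**2 * np.outer(lam, lam)

eta = np.zeros(3)                     # Sigma(0) = 0, see (3.2.68)
for n in range(5000):
    v_bar = alpha * (1 - 2 * mu * lam) ** n           # mean error along each eigenvector
    xi_n = xi_hat + np.sum(lam * v_bar**2)            # MSE evaluated at the mean weights
    eta = (np.eye(3) - H) @ eta + 4 * mu**2 * xi_n * lam   # recursion (3.2.73)

S = np.sum(lam / (1 - mu * lam))
eta_limit = mu * xi_hat / (1 - mu * S) / (1 - mu * lam)    # closed form (3.2.77)
print(eta, eta_limit)
```

The iterated eigenvalues of the weight covariance matrix settle at the closed-form values, provided the step size keeps $\mu\sum_i\lambda_i/(1-\mu\lambda_i)$ below one, as it does for these numbers.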


3.2.7 Excess Mean Square Error 

From the expressions of MSE given by (2.5.1), it follows that for a given w(n),

$$\xi(w(n)) = \hat{\xi} + v^H(n)Rv(n) \tag{3.2.79}$$

where $\hat{\xi}$ is the minimum MSE, v(n) is the error vector at the nth iteration denoting the
difference between estimated weights w(n) and the optimal weights $\hat{w}$, and $v^H(n)Rv(n)$ is
the excess MSE.

Taking the expected value over w on both sides of (3.2.79), the average value of the MSE
at the nth iteration is derived, that is,

$$\xi(n) = \hat{\xi} + E[v^H(n)Rv(n)] \tag{3.2.80}$$

where

$$\xi(n) = E[\xi(w(n))] \tag{3.2.81}$$

and $E[v^H(n)Rv(n)]$ denotes average excess MSE at the nth iteration.
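The decomposition (3.2.79) of the MSE into the minimum MSE and the excess MSE can be verified for any weight vector. The sketch below assumes hypothetical values for `R`, `z`, the reference power `p_r`, and the weight vector `w`, and uses the MSE expression of (3.2.62) evaluated at an arbitrary w.

```python
import numpy as np

# Hypothetical second-order statistics.
R = np.array([[2.0, 0.5 + 0.5j],
              [0.5 - 0.5j, 1.0]])
z = np.array([1.0 + 0.5j, -0.2j])
p_r = 2.0                                   # assumed mean power of the reference signal

def mse(w):
    """MSE for a given weight vector: w^H R w - z^H w - w^H z + p_r."""
    return (w.conj() @ R @ w - z.conj() @ w - w.conj() @ z + p_r).real

w_opt = np.linalg.solve(R, z)               # optimal weights
xi_hat = mse(w_opt)                         # minimum MSE

w = np.array([0.3 - 0.1j, 0.7 + 0.2j])      # arbitrary weights
v = w - w_opt                               # error vector
excess = (v.conj() @ R @ v).real            # excess MSE, v^H R v
print(mse(w), xi_hat + excess)
```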
Taking the limit as $n \to \infty$ yields the steady-state MSE, that is,

$$\begin{aligned}
\xi(\infty) &= \lim_{n\to\infty}\xi(n) \\
&= \hat{\xi} + \lim_{n\to\infty}E[v^H(n)Rv(n)]
\end{aligned} \tag{3.2.82}$$

Note that as $n \to \infty$, $E[v(n)] \to 0$ but the average value of the excess MSE does not approach
zero, that is, $\lim_{n\to\infty}E[v^H(n)Rv(n)] \ne 0$. Now let us discuss the meaning of this quantity.

Substituting for v(n) in (3.2.79),

$$\begin{aligned}
E[v^H(n)Rv(n)] ={}& E[w^H(n)Rw(n)] + \hat{w}^HR\hat{w} \\
& - \bar{w}^H(n)R\hat{w} - \hat{w}^HR\bar{w}(n)
\end{aligned} \tag{3.2.83}$$

Consider the mean output power of the processor for a given w, that is,

$$P(w(n)) = w^H(n)Rw(n) \tag{3.2.84}$$

Taking the expectation over w gives the mean output power at the nth iteration P(n),
that is,

$$\begin{aligned}
P(n) &= E[P(w(n))] \\
&= E[w^H(n)Rw(n)]
\end{aligned} \tag{3.2.85}$$

This along with (3.2.83) yields

$$\begin{aligned}
E[v^H(n)Rv(n)] ={}& P(n) + \hat{w}^HR\hat{w} \\
& - \bar{w}^H(n)R\hat{w} - \hat{w}^HR\bar{w}(n)
\end{aligned}$$

that in the limit becomes

$$\lim_{n\to\infty}E[v^H(n)Rv(n)] = P(\infty) - \hat{w}^HR\hat{w} \tag{3.2.86}$$

Thus, the steady-state average excess MSE is the difference between the mean output
power of the processor in the limit, $P(\infty)$, and the mean output power of the optimal
processor, $\hat{w}^HR\hat{w}$. It is the excess power contributed by the weight variance in the
steady state.

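Next, an independent expression for the steady-state average excess MSE is derived.
Using (3.2.60) and the notation of the previous section, it follows that

$$\begin{aligned}
E[w^H(n)Rw(n)] &= \mathrm{Tr}[Rk_{ww}(n)] + \bar{w}^H(n)R\bar{w}(n) \\
&= \mathrm{Tr}[Q^HRQQ^Hk_{ww}(n)Q] + \bar{w}^H(n)R\bar{w}(n) \\
&= \mathrm{Tr}[\Lambda\Sigma(n)] + \bar{w}^H(n)R\bar{w}(n) \\
&= \lambda^T\eta(n) + \bar{w}^H(n)R\bar{w}(n)
\end{aligned} \tag{3.2.87}$$

Substituting from (3.2.87) in (3.2.83),

$$E[v^H(n)Rv(n)] = \lambda^T\eta(n) + \bar{v}^H(n)R\bar{v}(n) \tag{3.2.88}$$

Taking the limits on both sides, this becomes

$$\lim_{n\to\infty}E[v^H(n)Rv(n)] = \lim_{n\to\infty}\lambda^T\eta(n) \tag{3.2.89}$$

It should be noted that (3.2.89) only holds in the limit. At the nth iteration, the average
excess MSE $E[v^H(n)Rv(n)]$ is not equal to $\lambda^T\eta(n)$. A relationship between the two quantities
is given by (3.2.88). Appendix 3.1 shows that

$$\lim_{n\to\infty}\lambda^T\eta(n) = \hat{\xi}\,\frac{\mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}} \tag{3.2.90}$$

Thus, we have the following result for the steady-state average excess MSE,
$\lim_{n\to\infty}E[v^H(n)Rv(n)]$. If $\mu$ satisfies

$$0 < \mu < \frac{1}{\lambda_{\max}} \tag{3.2.91}$$

and

$$\mu\sum_{i=1}^{L}\frac{\lambda_i}{1-\mu\lambda_i} < 1 \tag{3.2.92}$$

then

$$\lim_{n\to\infty}E[v^H(n)Rv(n)] = \hat{\xi}\,\frac{\mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}} \tag{3.2.93}$$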


3.2.8 Misadjustment 

The difference between the weights estimated by the adaptive algorithm and the optimal
weights is further characterized by the ratio of the average excess steady-state MSE to
the MMSE. It is referred to as the misadjustment [Wid66]. It is a dimensionless parameter
and measures the performance of the algorithm. The misadjustment is a kind of noise
caused by the use of the noisy estimate of the gradient. This noise is referred to as
the misadjustment noise.

Let M denote the misadjustment. Thus, by definition

$$M = \frac{\xi(\infty) - \hat{\xi}}{\hat{\xi}} \tag{3.2.94}$$

It follows from (3.2.94), (3.2.82), and (3.2.93) that when the gradient is estimated by multiplying
the array signals with the error between the array output and the reference signal, and
the gradient step size is selected such that (3.2.91) and (3.2.92) hold, then the misadjustment
$M_u$ for the unconstrained LMS algorithm is given by

$$M_u = \frac{\mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}} \tag{3.2.95}$$

For a sufficiently small $\mu$, this results in

$$M_u = \mu\,\mathrm{Tr}[R] \tag{3.2.96}$$


It follows from this expression that increasing $\mu$ increases the misadjustment noise. On
the other hand, an increase in $\mu$ causes the algorithm to converge faster, as discussed earlier.
Thus, the selection of the gradient step size requires satisfying conflicting demands: a larger
step size reaches the vicinity of the solution point more quickly but wanders around it over a larger
region, causing a bigger misadjustment, whereas a smaller step size arrives near the solution
point slowly with smaller movement in the weights at the end. The latter causes an additional problem,
particularly in nonstationary environments, say when an interference is slowly moving,
where the optimal solution moves, causing slowly adapting estimated weights to lag
behind the optimal weights. This phenomenon is referred to as the weight vector lag.
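The trade-off can be quantified by evaluating the misadjustment (3.2.95) and its small-step approximation (3.2.96) for a few step sizes. The following sketch uses hypothetical eigenvalues; the function names are illustrative only.

```python
def misadjustment(mu, eigenvalues):
    """Misadjustment of the unconstrained LMS algorithm, (3.2.95)."""
    if mu <= 0 or mu * max(eigenvalues) >= 1:
        raise ValueError("step size violates (3.2.91)")
    s = mu * sum(lam / (1.0 - mu * lam) for lam in eigenvalues)
    if s >= 1:
        raise ValueError("step size violates (3.2.92)")
    return s / (1.0 - s)

def misadjustment_small_mu(mu, eigenvalues):
    """Small-step approximation mu * Tr[R], (3.2.96)."""
    return mu * sum(eigenvalues)

eigenvalues = [2.0, 1.0, 0.5]          # hypothetical eigenvalues of R
for mu in (0.001, 0.01, 0.1):
    print(mu, misadjustment(mu, eigenvalues), misadjustment_small_mu(mu, eigenvalues))
```

For the smallest step size the two expressions agree closely; as the step size grows, the exact misadjustment rises faster than the linear approximation.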

Many schemes including variable step size have been suggested to overcome this problem
[Soo91, Pri91, Yas87, Eva93, Kwo92, Kwo92a, Har86, Che90]. Some of these are briefly
discussed.

The adaptive algorithm estimates the weights by minimizing the MSE. Thus, in schemes
where a variable step size is used, the step size reflects the value of the MSE at that iteration, going
up and down as the MSE goes up and down, such that it stays between the maximum
permissible value for convergence and the minimum value based on the allowed misadjustment.
It may be truly variable, or it may be allowed to switch between a few preselected
values for ease of implementation, for example by shifting one bit left or right where
a digital implementation is used. The step size may also be adjusted to reflect the change
in the direction of the error surface gradient at each iteration [Har86].

The optimal value of the step size at each step is suggested in [Yas87] such that it 
minimizes the MSE at each iteration. This is a function of the value of the true gradient 
at each iteration and the array correlation matrix. In practice, these may be replaced by 
their instantaneous values, leading to a suboptimal value. 

Instead of having a single step size for a whole weight vector, a variable step size can
be selected for each weight separately, leading to increased convergence of the algorithm
[Eva93]. The convergence speed of the algorithm may also be increased by adjusting
weights such that interferences are canceled one at a time [Ko93, Ko93a], and by using a
scheme known as block processing [Ben92]. For broadband signals, an implementation in
the frequency domain may help increase the speed of convergence.

Studies of frequency domain beamforming to estimate the weights using the
LMS algorithm when a reference signal is available [Den78, Nar81,
Flo88, Ber86] show that the frequency domain approach yields improved convergence and
reduced computational complexity compared to the time domain approach. Improved
convergence normally arises from the use of different gradient step sizes in different bins.
For the constrained LMS case, this is likely to cause deterioration in the steady-state
performance of the algorithm. This, however, does not affect the performance of the
unconstrained algorithm [Feu93].

The "sign algorithm," in which the error between the array output and the reference 
signal is replaced by its sign, is computationally less complex than the LMS algorithm, as 
discussed in [Che90, Mat87]. 
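The difference between the two updates can be sketched as follows. This is an illustration under stated assumptions rather than the formulation of [Che90, Mat87]: the LMS update follows (3.2.20), and for complex data the sign is assumed here to be taken separately on the real and imaginary parts of the error. The function names are hypothetical.

```python
import numpy as np

def complex_sign(e):
    # Assumed definition for complex data: sign of the real and imaginary parts.
    return np.sign(e.real) + 1j * np.sign(e.imag)

def lms_update(w, x, r, mu):
    """One LMS iteration, equivalent to (3.2.20)."""
    err = r - np.vdot(w, x)                 # reference minus array output w^H x
    return w + 2 * mu * x * np.conj(err)

def sign_lms_update(w, x, r, mu):
    """Same update with the error replaced by its sign."""
    err = r - np.vdot(w, x)
    return w + 2 * mu * x * np.conj(complex_sign(err))

w0 = np.zeros(2, dtype=complex)
x = np.array([1.0, 1.0j])                   # hypothetical array sample
r = 2.0 - 3.0j                              # hypothetical reference sample
print(lms_update(w0, x, r, 0.1), sign_lms_update(w0, x, r, 0.1))
```

In the sign version the correction applied to each weight has a magnitude fixed by the step size and the data sample only, so the multiplication by the error value is avoided.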

The algorithm is usually analyzed assuming that successive samples are uncorrelated.
This assumption simplifies the mathematics by allowing expectations of data
products to be replaced by the products of their expectations. Discussion of correlated
samples in a nonstationary environment may be found in [Ber84, Ber85, Ewe90]. Applications
of the unconstrained LMS algorithm to mobile communication systems using an
array include base mobile communication systems [Win84], indoor radio systems [Win87],
and satellite-to-satellite communication systems [Jon95].


3.3 Normalized Least Mean Squares Algorithm 

This algorithm is a variation of the constant-step-size LMS algorithm and uses a data-dependent
step size at each iteration [God97]. At the nth iteration, the step size is given by

$$\mu(n) = \frac{\mu_0}{x^H(n)x(n)} \tag{3.3.1}$$

where $\mu_0$ is a constant. The algorithm and its convergence using various types of data
have been studied widely [Nit85, Nit86, Ber86a, Slo93, Rup93]. It avoids the need for
estimating the eigenvalues of the correlation matrix or its trace for selection of the maximum
permissible step size. The algorithm normally has better convergence performance
and less signal sensitivity compared to the normal LMS algorithm. See [Bar94] for discussion
of its application to mobile communications.
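A single normalized update can be sketched as below. The step size follows (3.3.1); combining it with the LMS update form of (3.2.20) is an assumption made for illustration, as are the function names and the sample values. A practical implementation would also guard against a zero-power sample, which is not shown.

```python
import numpy as np

def nlms_step_size(x, mu0):
    """Data-dependent step size of (3.3.1): mu0 / (x^H x)."""
    return mu0 / np.vdot(x, x).real

def nlms_update(w, x, r, mu0):
    """One LMS-type update using the normalized step size (assumed form)."""
    mu_n = nlms_step_size(x, mu0)
    err = r - np.vdot(w, x)                 # reference minus array output w^H x
    return w + 2 * mu_n * x * np.conj(err)

x = np.array([3.0, 4.0j])                   # hypothetical array sample, x^H x = 25
w = np.array([0.1 + 0.2j, -0.3])            # hypothetical current weights
r = 1.0 - 2.0j                              # hypothetical reference sample
w_new = nlms_update(w, x, r, 0.5)
print(nlms_step_size(x, 0.5), r - np.vdot(w_new, x))
```

Because the step is divided by the instantaneous input power, scaling the data and the reference by a common factor leaves the weight update unchanged, which is one way to see the reduced signal sensitivity mentioned above.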


3.4 Constrained Least Mean Squares Algorithm 

A real-time constrained algorithm [Hud81, Fro72, Can80, God83, God86, God89, God90,
God93, God97, Mos70] for determining the optimal weight vector $\hat{w}$ is

$$w(n+1) = P\{w(n) - \mu g(w(n))\} + \frac{S_0}{L} \tag{3.4.1}$$

where

$$P = I - \frac{S_0S_0^H}{L} \tag{3.4.2}$$

is a projection operator; g(w(n)) is an unbiased estimate of the gradient of the power
surface $w^H(n)Rw(n)$ with respect to w(n) after the nth iteration; $\mu$ is the gradient step size,
a positive scalar constant that controls the characteristics of the adaptive algorithm; and
$S_0$ is the steering vector in the look direction.

FIGURE 3.1
Constrained LMS algorithm: a pictorial view of projection process.

The algorithm is called constrained because the weight vector satisfies the constraint at
every iteration, that is, $w^H(n)S_0 = 1\ \forall n$. The process of imposing constraints may be
understood from Figure 3.1. It shows how weights are updated and how the projection
system works using a vector diagram for a two-weight system [Fro72]. The figure shows
constant power contours; the constraint surface (a line $w^HS_0 = 1$ for a two-dimensional
system); a surface parallel to the constraint surface passing through the origin ($w^HS_0 = 0$);
weight vectors w(n), w(n + 1), and $\hat{w}$; and the gradient at the nth iteration.

Point A on the diagram indicates the position of the weight after completion of the nth
iteration. It lies at the intersection of the constraint surface $w^HS_0 = 1$ and the power surface
$w^H(n)Rw(n)$ (not shown in the figure). The weights are perturbed by adding a small
amount $-\mu g(w(n))$ and then are projected on $w^HS_0 = 0$ using projection operator P. This
is indicated by point B on the diagram. Note that $PS_0 = 0$; thus the projection operator
projects the weights orthogonal to $S_0$. The vector $S_0/L$ is added to restore the constraint.
This action moves the updated weights w(n + 1) to point C. The process continues by
moving the estimated weights toward point D, the optimal solution.

The effect of the gradient step size $\mu$ on the convergence speed and misadjustment noise
may also be understood using Figure 3.1. A larger step size means that the weight vector
moves faster toward point D, the solution point, but wanders around it over a larger
region, not closely approaching and causing more misadjustment.
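The perturb, project, and restore steps described above can be sketched in a few lines. The example is a minimal illustration under stated assumptions: a hypothetical four-element, half-wavelength-spaced linear array with a look direction of 20°, noise-only input data, and the gradient estimate of the standard algorithm, $2x(n+1)y^*(w(n))$.

```python
import numpy as np

L = 4
# Hypothetical look-direction steering vector (unit-magnitude elements).
S0 = np.exp(1j * np.pi * np.arange(L) * np.sin(np.deg2rad(20.0)))
P = np.eye(L) - np.outer(S0, S0.conj()) / L      # projection operator (3.4.2)

def constrained_lms_update(w, x, mu):
    y = np.vdot(w, x)                    # array output w^H x
    g = 2 * x * np.conj(y)               # gradient estimate of the standard algorithm
    return P @ (w - mu * g) + S0 / L     # perturb, project, restore: (3.4.1)

rng = np.random.default_rng(0)
w = S0 / L                               # initial weights satisfying the constraint
max_dev = 0.0
for _ in range(200):
    x = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    w = constrained_lms_update(w, x, mu=0.01)
    max_dev = max(max_dev, abs(np.vdot(w, S0) - 1))   # deviation from w^H S0 = 1

print(max_dev)
```

The constraint holds at every iteration regardless of the data, because the projected part is orthogonal to $S_0$ and the added term $S_0/L$ contributes exactly one to $w^HS_0$.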

The gradient of $w^H(n)Rw(n)$ with respect to w(n) is given by

$$\nabla_w(w^HRw)\big|_{w=w(n)} = 2Rw(n) \tag{3.4.3}$$


© 2004 by CRC Press LLC 







and its computation using this expression requires knowledge of R, which normally is
not available in practice. A typical scheme to estimate the required gradient is to replace
R by its noisy sample $x(n+1)x^H(n+1)$ available at time instant (n + 1).

There are a number of schemes used for estimating the required gradient [Fro72, Can80,
God83, God86, God90, God89]. Even though the estimated gradient in each case is unbiased,
the covariance of the estimated gradient obtained with each method is different, and
thus the transient and steady-state behavior of the constrained algorithm is different in
each case. In the following sections, some of these methods are described and the behavior
of the algorithm in each case is examined.

First, the normal gradient estimation scheme where R is replaced by its noisy sample is 
discussed, and the algorithm in this case is referred to as the standard LMS algorithm to 
differentiate it from the algorithm when a gradient estimated by different methods is used. 

In the next section, the gradient estimation scheme used by the standard LMS algorithm 
is described, and then some properties of the gradient are discussed along with the 
convergence of the weights estimated by the algorithm to the optimal weights and the 
study of the misadjustment [God86, God93]. 


3.4.1 Gradient Estimate 

When all receiver outputs are accessible, the usual estimate of the gradient is made by
multiplying the receiver outputs by the array output, that is,

$$g(w(n)) = 2x(n+1)y^*(w(n)) \tag{3.4.4}$$

In obtaining this estimate, the array correlation matrix has been replaced by $x(n+1)x^H(n+1)$,
which is a noisy sample of the array correlation matrix at the time instant (n + 1).

If {x(n)} is a zero-mean, stationary complex vector process, then for a given w(n) the
estimate of the gradient defined by (3.4.4) is unbiased, that is,

$$\begin{aligned}
E[g(w(n)) \mid w(n)] &= E[2x(n+1)y^*(w(n))] \\
&= 2E[x(n+1)x^H(n+1)w(n)] \\
&= 2Rw(n)
\end{aligned} \tag{3.4.5}$$
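The role of the noisy sample can be illustrated by averaging the estimate (3.4.4) over a block of samples for a fixed weight vector, taking the array output as $y = w^Hx$. The sketch below uses hypothetical random data and weights; the average of the per-sample estimates equals twice the sample correlation matrix times the weights, which approaches 2Rw as the block grows.

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 3, 1000
# Hypothetical zero-mean complex array samples, one row per time instant.
X = (rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))) / np.sqrt(2)
w = np.array([0.5, 0.2 - 0.1j, -0.3j])       # fixed, arbitrary weight vector

y = X @ w.conj()                             # array outputs y_k = w^H x_k
G = 2 * X * np.conj(y)[:, None]              # per-sample gradient estimates 2 x_k y_k^*
R_hat = X.T @ X.conj() / N                   # sample correlation matrix (1/N) sum x_k x_k^H

print(np.allclose(G.mean(axis=0), 2 * R_hat @ w))
```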


3.4.2 Covariance of Gradient 

The covariance of the gradient estimate used in the weight update equation is important
in determining the performance of the algorithm, as was discussed previously. To obtain
results on the covariance of the gradient estimate defined by (3.4.4), it is necessary to make
an additional Gaussian assumption about the sequence {x(k)}. Thus, if {x(k)} is an i.i.d.
complex Gaussian sequence, then $V_{g_S}(w(n))$, the covariance of the gradient estimated by
this method for a given w(n), is given by

$$V_{g_S}(w(n)) = 4w^H(n)Rw(n)R \tag{3.4.6}$$

A derivation of (3.4.6) is presented in Appendix 3.2.

It follows from the expression that the covariance at the nth iteration is proportional to
the mean output power of the processor for a given w(n), the quantity that the gradient
algorithm is trying to minimize. Thus, the gradient estimate improves as the weight vector
approaches the optimal value.


3.4.3 Convergence of Weight Vector 

In this section, results on convergence of the estimated weights to the optimal weights
are presented. The derivation of these results appears in Appendix 3.3.

Let $\lambda_{\max}$ denote the maximum eigenvalue of PRP and $\lambda_i$ denote the ith eigenvalue of PRP.
If {x(k)} is an i.i.d. Gaussian sequence, $w^H(0)w(0) < \infty$, and

$$0 < \mu < \frac{1}{\lambda_{\max}} \tag{3.4.7}$$

then

$$\lim_{n\to\infty}E[w(n)] = \hat{w} \tag{3.4.8}$$

and the convergence of E[w(n)] to $\hat{w}$ along the ith eigenvector of PRP has the following
time constant:

$$\tau_i = -\frac{1}{\ln[1 - 2\mu\lambda_i]} \tag{3.4.9}$$

where ln[ ] denotes the natural logarithm.

Thus, the mean value of the estimated weights converges to the optimal weights in the
limit provided that one starts with a bounded initial weight vector and the gradient step
size is small enough to satisfy the condition (3.4.7). It should be noted that the upper limit on
the gradient step size, as well as the convergence speed, depends on PRP. It follows from

$$R = p_SS_0S_0^H + R_N \tag{3.4.10}$$

and

$$PS_0 = 0 \tag{3.4.11}$$

that $PRP = PR_NP$, and hence the convergence speed of the mean value of weights characterized
by the time constants and the upper limit on the gradient step size only depend
on the eigenvalues of $PR_NP$, indicating that the signal arriving from the look direction
does not affect these quantities. The eigenvalues of $PR_NP$ are functions of the directions
and powers of directional sources as well as the array geometry, with the maximum
eigenvalue being controlled by the strongest source governing the initial convergence
speed. The latter part of the convergence is controlled by the smaller eigenvalues associated
with weak sources or background noise, and thus the overall speed of the algorithm
depends on the eigenvalue spread of $PR_NP$.
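The independence of these quantities from the look-direction signal can be checked numerically. The sketch below assumes a hypothetical four-element, half-wavelength-spaced linear array, a look direction at broadside, one interference at 40°, and hypothetical powers; it compares the eigenvalues of PRP for two very different look-direction signal powers.

```python
import numpy as np

L = 4

def steering(theta_deg):
    # Hypothetical steering vector of a half-wavelength-spaced linear array.
    return np.exp(1j * np.pi * np.arange(L) * np.sin(np.deg2rad(theta_deg)))

S0 = steering(0.0)                       # look direction
SI = steering(40.0)                      # interference direction
p_S, p_I, sigma2 = 1.0, 10.0, 0.1        # hypothetical signal, interference, noise powers

R_N = p_I * np.outer(SI, SI.conj()) + sigma2 * np.eye(L)   # noise-only correlation matrix
R = p_S * np.outer(S0, S0.conj()) + R_N                    # total correlation matrix (3.4.10)
P = np.eye(L) - np.outer(S0, S0.conj()) / L                # projection operator

lam_weak = np.linalg.eigvalsh(P @ R @ P)
R_strong = 100.0 * p_S * np.outer(S0, S0.conj()) + R_N     # look-direction signal 20 dB stronger
lam_strong = np.linalg.eigvalsh(P @ R_strong @ P)
print(lam_weak, lam_strong)
```

Both sets of eigenvalues coincide, so the step-size limit (3.4.7) and the time constants (3.4.9) are unchanged when only the look-direction signal power varies.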
The discussion thus far has concentrated on the convergence of the mean value of
weights to optimal weights. The variance of these weights is an important parameter and
the transient and steady-state behavior of the weight covariance matrix $k_{ww}(n)$ are indicators
of algorithm performance as discussed previously for the unconstrained LMS algorithm.


3.4.4 Weight Covariance Matrix 

The weight covariance matrix is defined as

$$k_{ww}(n) = E\left[(w(n) - \bar{w}(n))(w(n) - \bar{w}(n))^H\right] \qquad (3.4.12)$$

where

$$\bar{w}(n) = E[w(n)] \qquad (3.4.13)$$

Appendix 3.4 shows that the matrix satisfies the following recursive relation. If $V_g(w(n))$
denotes the covariance of the gradient used in the constrained LMS algorithm for a given
$w(n)$, and $k_{ww}(n)$ denotes the covariance of $w(n)$, then

$$k_{ww}(n+1) = P k_{ww}(n) P - 2\mu P[R k_{ww}(n) + k_{ww}(n) R]P + 4\mu^2 P R k_{ww}(n) R P + \mu^2 P E[V_g(w(n))] P \qquad (3.4.14)$$

where the expectation is taken over $w$.

The weight covariance matrix at each iteration depends on the mean value of the
covariance of the gradient used at the previous iteration. Equation (3.4.14) may be further
simplified by substituting for $V_g(w(n))$. Taking expectation over $w(n)$, pre- and
post-multiplying by $P$ on both sides of the expression for the covariance of the gradient given
by (3.4.6), and using (3.2.60),

$$P E[V_{g_S}(w(n))] P = 4 PRP\, E[w^H(n) R w(n)] = 4 PRP \{\mathrm{Tr}[R k_{ww}(n)] + k_0(n)\} \qquad (3.4.15)$$

where

$$k_0(n) = \bar{w}^H(n) R \bar{w}(n) \qquad (3.4.16)$$

Equations (3.4.14) and (3.4.15) imply that

$$k_{ww}(n+1) = P k_{ww}(n) P - 2\mu\{P R k_{ww}(n) + k_{ww}(n) R P\} + 4\mu^2 P R k_{ww}(n) R P + 4\mu^2 PRP\{\mathrm{Tr}[R k_{ww}(n)] + k_0(n)\} \qquad (3.4.17)$$
3.4.5 Transient Behavior of Weight Covariance Matrix

The study of the convergence and transient behavior of the weight covariance matrix
presented here requires that the matrix be diagonalized. Conditions required for
diagonalization of the weight covariance matrix by the transformation that also diagonalizes
$PRP$ are described below.

The necessary and sufficient condition for the diagonalization of $k_{ww}(n+1)$, $n \ge 0$,
and $PRP$ by the same unitary transformation is that the unitary transformation also
diagonalizes $P E[V_g(w(n))] P$ for all $n$, where $V_g(w(n))$ is the covariance of $g(w(n))$ for a
given $w(n)$ and the expectation is taken over $w$. A proof of the diagonalization conditions
is presented in Appendix 3.5.

Thus, to verify that the weight covariance matrix for the standard algorithm is
diagonalizable by the same unitary transformation that diagonalizes $PRP$, we need to test whether this
transformation diagonalizes $P E[V_{g_S}(w(n))] P$. Since $PRP$ is a Hermitian matrix, a unitary
matrix $Q$ exists such that

$$Q^H PRP\, Q = \Lambda \qquad (3.4.18)$$

where $\Lambda$ is a diagonal matrix with its diagonal elements being the eigenvalues of $PRP$.

It follows from (3.4.15) and (3.4.18) that

$$Q^H P E[V_{g_S}(w(n))] P Q = 4\Lambda\, \mathrm{Tr}[R k_{ww}(n)] + 4\Lambda\, k_0(n) \qquad (3.4.19)$$

This implies that $V_{g_S}(w(n))$ satisfies the conditions required for the diagonalization of $k_{ww}(n)$.
Thus, $Q^H k_{ww}(n) Q$ is a diagonal matrix when the covariance of the gradient used for updating
$w(n)$ is given by (3.4.6). Let this be denoted by the diagonal matrix $\Sigma(n)$, that is,

$$\Sigma(n) = Q^H k_{ww}(n) Q \qquad (3.4.20)$$

Now the transient behavior of $\Sigma(n)$ is analyzed. To study the transient behavior of $\Sigma(n)$,
a matrix difference equation for $\Sigma(n)$ is developed, a vector difference equation for its
diagonal terms is derived, and its solution is presented.

Pre- and postmultiplying (3.4.17) by $Q^H$ and $Q$, noting that

$$P^2 = P \qquad (3.4.21)$$

$$k_{ww}(n) = P k_{ww}(n) P \qquad (3.4.22)$$

and using (3.4.20), the following matrix difference equation is derived:

$$\Sigma(n+1) = \Sigma(n) - 4\mu\Lambda\Sigma(n) + 4\mu^2\Lambda^2\Sigma(n) + 4\mu^2\Lambda\{\mathrm{Tr}(\Lambda\Sigma(n)) + k_0(n)\} \qquad (3.4.23)$$

Let the two L-dimensional vectors $\lambda$ and $\eta(n)$ represent the L eigenvalues of $PRP$ and
$k_{ww}(n)$, respectively, that is,

$$\lambda = [\lambda_1, \lambda_2, \ldots, \lambda_L]^T \qquad (3.4.24)$$

and

$$\eta(n) = [\eta_1(n), \eta_2(n), \ldots, \eta_L(n)]^T \qquad (3.4.25)$$

where $\lambda_i$ and $\eta_i(n)$, $i = 1, 2, \ldots, L$, are the eigenvalues of $PRP$ and $k_{ww}(n)$, respectively.
From (3.4.23) to (3.4.25) and the fact that

$$\mathrm{Tr}[\Lambda\Sigma(n)] = \lambda^T \eta(n) \qquad (3.4.26)$$

the following vector difference equation for the eigenvalues of $k_{ww}(n)$ is derived:

$$\eta(n+1) = [I - 4\mu\Lambda + 4\mu^2\Lambda^2 + 4\mu^2\lambda\lambda^T]\,\eta(n) + 4\mu^2 k_0(n)\,\lambda \qquad (3.4.27)$$

Since

$$\lim_{n\to\infty} \bar{w}(n) = \hat{w} \qquad (3.4.28)$$

it follows from (3.4.16) that

$$\lim_{n\to\infty} k_0(n) = \hat{w}^H R \hat{w} \qquad (3.4.29)$$

With

$$H = 4\mu\Lambda - 4\mu^2\Lambda^2 - 4\mu^2\lambda\lambda^T \qquad (3.4.30)$$

(3.4.27) has the solution

$$\eta(n) = (I - H)^n \eta(0) + 4\mu^2 \sum_{i=1}^{n} k_0(n-i)(I - H)^{i-1}\lambda \qquad (3.4.31)$$

where $\eta(0)$ denotes the eigenvalues of $k_{ww}(0)$. Since $Q$ diagonalizes $k_{ww}(n)$, it follows that

$$k_{ww}(n) = \sum_{l=1}^{L} \eta_l(n)\, Q_l Q_l^H \qquad (3.4.32)$$

where $Q_l$, $l = 1, 2, \ldots, L$ are the eigenvectors of $PRP$.

Equations (3.4.31) and (3.4.32) completely describe the transient behavior of the weight
covariance.
3.4.6 Convergence of Weight Covariance Matrix

In this section, the convergence of the weight covariance matrix is examined. Consider
the following equation:

$$\eta(n+1) = (I - H)\eta(n) + 4\mu^2 k_0(n)\lambda \qquad (3.4.33)$$

This represents a set of L difference equations. Before studying the convergence, these
equations are reduced to a set of L - 1 difference equations by showing that one of the
components in each of the vectors is identical to zero.

Let $\lambda_{\min}(\cdot)$ denote the minimum eigenvalue of a matrix $(\cdot)$. Based on (3.4.22) and
$\lambda_{\min}(P) = 0$, $\lambda_{\min}(k_{ww}(n)) = 0$. Also, $\lambda_{\min}(PRP) = 0$. Let

$$\lambda_l \triangleq \lambda_{\min} = 0 \qquad (3.4.34)$$

and $Q_l$ be the eigenvector corresponding to $\lambda_l$. Since $Q$ diagonalizes $k_{ww}(n)$ and $P$, $Q_l$ must
also be the eigenvector corresponding to the zero eigenvalue of $k_{ww}(n)$ and $P$. Thus,

$$\eta_l(n) = 0 \qquad (3.4.35)$$

It follows from (3.4.34) and (3.4.35) that the lth difference equation in (3.4.33) is identical
to zero. Thus, these reduce to a set of L - 1 difference equations. Define (L - 1)-dimensional
vectors $\lambda'$ and $\eta'(n)$ such that the ith component is given by

$$(\cdot)'_i = \begin{cases} (\cdot)_i, & i = 1, 2, \ldots, l-1 \\ (\cdot)_{i+1}, & i = l, l+1, \ldots, L-1 \end{cases} \qquad (3.4.36)$$

where $(\cdot)'$ denotes the (L - 1)-dimensional vectors $\lambda'$ and $\eta'(n)$, and $(\cdot)$ denotes the
corresponding L-dimensional vectors $\lambda$ and $\eta(n)$. Similarly, define the $(L-1) \times (L-1)$
matrix $H'$ by dropping the column of zeros and the row of zeros from $H$.

With $\Lambda'$ denoting the diagonal matrix of the L - 1 nonzero eigenvalues of $PRP$, it follows
from (3.4.30) and the definition of the above (L - 1)-dimensional vectors that

$$H' = 4\mu\Lambda' - 4\mu^2\Lambda'^2 - 4\mu^2\lambda'\lambda'^T \qquad (3.4.37)$$

It follows from (3.4.33) to (3.4.37) that

$$\eta'(n+1) = (I - H')\eta'(n) + 4\mu^2 k_0(n)\lambda' \qquad (3.4.38)$$

It can be shown that $\lim_{n\to\infty} \eta'(n)$ exists under certain conditions (see Appendix 3.6) and is
given by

$$\lim_{n\to\infty} \eta'(n) = 4\mu^2\, \hat{w}^H R \hat{w}\, H'^{-1}\lambda' \qquad (3.4.39)$$

Substituting for the inverse of $H'$,

$$\lim_{n\to\infty} \eta'(n) = \frac{\mu\, \hat{w}^H R \hat{w}}{1 - \mu\sum_{i=1}^{L-1}\frac{\lambda'_i}{1-\mu\lambda'_i}} \left[\frac{1}{1-\mu\lambda'_1}, \ldots, \frac{1}{1-\mu\lambda'_{L-1}}\right]^T \qquad (3.4.40)$$

Substituting for the eigenvalues of $k_{ww}(n)$ from (3.4.40) in (3.4.32) yields the steady-state
expression for the covariance matrix:

$$\lim_{n\to\infty} k_{ww}(n) = \frac{\mu\, \hat{w}^H R \hat{w}}{1 - \mu\sum_{i=1}^{L-1}\frac{\lambda'_i}{1-\mu\lambda'_i}} \sum_{l=1,\ \lambda_l \neq 0}^{L} \frac{1}{1-\mu\lambda_l}\, Q_l Q_l^H \qquad (3.4.41)$$


3.4.7 Misadjustment 

Misadjustment is a dimensionless measure of algorithm performance near the convergence
point, as discussed previously. It is a normalized difference between the adaptive and
optimal performance of a processor. It is defined as the ratio of the excess mean output
power to the optimal power, that is,

$$M = \lim_{n\to\infty} \frac{E[w^H(n)x(n+1)x^H(n+1)w(n)] - \hat{w}^H R \hat{w}}{\hat{w}^H R \hat{w}} \qquad (3.4.42)$$

Noting that $w(n)$ and $x(n+1)$ are independent, the expectation over $w(n)$ and $x(n+1)$
in (3.4.42) can be taken independently. Taking the conditional expectation for a given $w(n)$,
it follows that

$$E[w^H(n)x(n+1)x^H(n+1)w(n) \mid w(n)] = w^H(n) R w(n) = \mathrm{Tr}[w(n)w^H(n)R] \qquad (3.4.43)$$

Since

$$E[w(n)w^H(n)] = R_{ww}(n) = k_{ww}(n) + \bar{w}(n)\bar{w}^H(n) \qquad (3.4.44)$$

it follows from (3.4.43), after taking unconditional expectation on both sides, that

$$\begin{aligned} E[w^H(n)x(n+1)x^H(n+1)w(n)] &= \mathrm{Tr}[R_{ww}(n)R] \\ &= \mathrm{Tr}[k_{ww}(n)R + \bar{w}(n)\bar{w}^H(n)R] \\ &= \mathrm{Tr}[k_{ww}(n)R] + \bar{w}^H(n)R\bar{w}(n) \end{aligned} \qquad (3.4.45)$$

and (3.4.42) becomes

$$M = \lim_{n\to\infty} \frac{\mathrm{Tr}[k_{ww}(n)R] + \bar{w}^H(n)R\bar{w}(n) - \hat{w}^H R \hat{w}}{\hat{w}^H R \hat{w}} \qquad (3.4.46)$$

The contribution of the second and third terms in (3.4.46) is zero in the limit because of
(3.4.8). Since $k_{ww}(n) = P k_{ww}(n) P$, it follows that

$$\begin{aligned} \mathrm{Tr}[k_{ww}(n)R] &= \mathrm{Tr}[P k_{ww}(n) P R] \\ &= \mathrm{Tr}[k_{ww}(n) PRP] \\ &= \mathrm{Tr}[k_{ww}(n) Q\Lambda Q^H] \\ &= \mathrm{Tr}[Q^H k_{ww}(n) Q \Lambda] \\ &= \lambda^T d(n) \end{aligned} \qquad (3.4.47)$$

where

$$d(n) = \mathrm{Diag}[Q^H k_{ww}(n) Q] \qquad (3.4.48)$$

Thus, (3.4.46) becomes

$$M = \lim_{n\to\infty} \frac{\lambda^T d(n)}{\hat{w}^H R \hat{w}} \qquad (3.4.49)$$

Appendix 3.6 proves that for the standard LMS algorithm, if

$$0 < \mu < \frac{1}{2\lambda_{\max}} \qquad (3.4.50)$$

and

$$\mu \sum_{i=1}^{L-1} \frac{\lambda'_i}{1-\mu\lambda'_i} < 1 \qquad (3.4.51)$$

then the misadjustment is given by

$$M_S = \frac{\mu \sum_{i=1}^{L-1} \frac{\lambda'_i}{1-\mu\lambda'_i}}{1 - \mu \sum_{i=1}^{L-1} \frac{\lambda'_i}{1-\mu\lambda'_i}} \qquad (3.4.52)$$

For sufficiently small $\mu$, this results in

$$M_S = \mu \sum_{i=1}^{L-1} \lambda'_i = \mu\, \mathrm{Tr}(PRP) \qquad (3.4.53)$$
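
As a numerical sanity check on (3.4.38), (3.4.40), and (3.4.52), the following sketch iterates the eigenvalue recursion with $k_0(n)$ held at its limiting value and compares the result with the closed-form steady state. It is an illustration only: the function names, the example eigenvalues, and the step size are assumptions made for the demonstration and do not come from the text.

```python
def steady_state_eta(mu, lam, k0):
    """Closed form (3.4.40): steady-state nonzero eigenvalues of k_ww.

    mu  : gradient step size
    lam : nonzero eigenvalues of PRP (hypothetical example values)
    k0  : limiting value of k_0(n), i.e. w_hat^H R w_hat
    """
    s = sum(l / (1.0 - mu * l) for l in lam)
    scale = mu * k0 / (1.0 - mu * s)
    return [scale / (1.0 - mu * l) for l in lam]


def iterate_eta(mu, lam, k0, steps):
    """Iterate (3.4.38) from eta'(0) = 0 with k_0(n) fixed at k0."""
    eta = [0.0] * len(lam)
    for _ in range(steps):
        coupling = sum(l * e for l, e in zip(lam, eta))  # lambda'^T eta'(n)
        eta = [
            (1.0 - 4.0 * mu * l + 4.0 * mu**2 * l**2) * e
            + 4.0 * mu**2 * l * (coupling + k0)
            for l, e in zip(lam, eta)
        ]
    return eta


def misadjustment(mu, lam):
    """Misadjustment (3.4.52) of the standard constrained LMS algorithm."""
    s = mu * sum(l / (1.0 - mu * l) for l in lam)
    if mu <= 0.0 or any(mu * l >= 1.0 for l in lam) or s >= 1.0:
        raise ValueError("step size is outside the convergence conditions")
    return s / (1.0 - s)


lam = [0.5, 2.0, 5.0]   # assumed nonzero eigenvalues of PRP
mu, k0 = 0.01, 1.0      # assumed step size and optimal mean output power
eta_iter = iterate_eta(mu, lam, k0, steps=5000)
eta_closed = steady_state_eta(mu, lam, k0)
```

For these assumed values the iterated and closed-form vectors agree to numerical precision, and $\lambda'^T \eta' / k_0$ reproduces the misadjustment of (3.4.52), which tends to $\mu\,\mathrm{Tr}(PRP)$ as $\mu$ becomes small.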


3.5 Perturbation Algorithms 

The LMS algorithm discussed in previous sections requires that the signals on all elements 
are accessible. In some situations this may not be possible. For example, in a large radio 
frequency array it may not be economical to provide a coherent channel on all elements 
in the array and thereby make the required signal inaccessible. In situations like this, one 
needs to estimate the required gradient by other means if the LMS algorithm is to be used 
for weight updating. 

In this section, a method to estimate the required gradient for the LMS algorithm when
the signals on all elements are not accessible is described using three different processor
structures. One structure uses a single receiver to measure the power of the processor and
is referred to as a single-receiver system. The other two structures use two receivers to
measure the output power, one using dual perturbation and the other using a reference
receiver. The gradient estimate obtained using each of the three structures is unbiased
[Can80].

LMS algorithm performance using the gradient estimate by this method can be analyzed
using an approach similar to that used in previous sections. Here, however, only the results
on the mean and covariance of the gradient, the covariance of the weights, and the
misadjustment are stated. The method described in this section is for updating
weights of the constrained optimal beamformer. The methods applicable to other processors
can easily be developed using a similar approach.

The method uses orthogonal sequences to perturb the weights of the processor, and
then measures the output power of the processor to estimate the required gradient. The
LMS algorithm using the gradient estimated by this method is referred to as the perturbation
algorithm [Can80]. The perturbation algorithm requires more array samples and thus more
time than the LMS algorithm discussed in previous sections. A weight iteration cycle in
this case includes a complete weight perturbation cycle occupying, say, M time instants
to estimate the required gradient. Thus, the weight iteration index and the time index are
not the same in the perturbation algorithm, as may be the case for the standard LMS algorithm.
Details on the algorithm and its analyses may be found in [Can80, God83, God86, God93].

Consider some useful definitions required to understand the material discussed in this
section. Let S denote a complex vector sequence defined as

$$S = \{\delta(1), \delta(2), \ldots, \delta(M)\} \qquad (3.5.1)$$

where $\delta(i)$, $i = 1, 2, \ldots, M$ are L-dimensional complex vectors.

The sequence S is said to be an orthogonal complex vector sequence if

$$\frac{1}{M}\sum_{i=1}^{M} \mathrm{Re}[\delta(i)]\,\mathrm{Re}[\delta(i)]^T = I \qquad (3.5.2)$$

$$\frac{1}{M}\sum_{i=1}^{M} \mathrm{Im}[\delta(i)]\,\mathrm{Im}[\delta(i)]^T = I \qquad (3.5.3)$$

$$\frac{1}{M}\sum_{i=1}^{M} \mathrm{Re}[\delta(i)]\,\mathrm{Im}[\delta(i)]^T = 0 \qquad (3.5.4)$$

and

$$\frac{1}{M}\sum_{i=1}^{M} \mathrm{Im}[\delta(i)]\,\mathrm{Re}[\delta(i)]^T = 0 \qquad (3.5.5)$$

The sequence S is said to be of zero mean if

$$\frac{1}{M}\sum_{i=1}^{M} \delta(i) = 0 \qquad (3.5.6)$$

and is said to have odd symmetry if for every $i$, $1 \le i \le M$, there exists a $j$, $1 \le j \le M$, such
that $\delta(i) = -\delta(j)$.

The next section discusses a scheme to generate the required perturbation sequences. 


3.5.1 Time Multiplex Sequence 

Perturbation sequences with the required properties for obtaining an unbiased gradient
estimate can be constructed in a number of ways. However, for a time multiplex sequence
it is possible to evaluate certain expressions in closed form. A procedure to construct a
time multiplex sequence is given below. Let

$$h_i(j) = \begin{cases} \sqrt{2L}, & j = 2i-1 \\ -\sqrt{2L}, & j = 2i \\ 0, & \text{elsewhere in the range } 1 \le j \le 4L \end{cases} \qquad i = 1, 2, \ldots, 2L \qquad (3.5.7)$$

A multiplex sequence can be defined in terms of $h_i(j)$ as follows:

$$\mathrm{Re}(\delta_i(j)) = h_i(j), \quad \mathrm{Im}(\delta_i(j)) = h_{i+L}(j), \qquad i = 1, 2, \ldots, L, \quad j = 1, 2, \ldots, 4L \qquad (3.5.8)$$

where $\delta_i(j)$ denotes the ith element of the column vector $\delta(j)$.

It can be verified that S has zero mean and odd symmetry, and satisfies the required
orthogonality properties. The time multiplex sequence defined above has length M = 4L and can
be used to obtain an unbiased gradient estimate for all three structures. However, in the
case of a dual receiver with dual perturbation, a time multiplex sequence of length M =
2L can be constructed, which provides an unbiased estimate of the gradient [Can80].
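
The construction in (3.5.7) and (3.5.8) can be checked numerically. The sketch below builds the length-4L sequence and provides helpers for testing the orthogonality conditions (3.5.2) to (3.5.5), the zero-mean condition (3.5.6), and odd symmetry. The function names and the choice L = 3 are assumptions made purely for illustration.

```python
import math


def time_multiplex_sequence(L):
    """Build the length-4L time multiplex sequence of (3.5.7) and (3.5.8).

    Returns a list [delta(1), ..., delta(4L)]; each delta(j) is a list of
    L complex numbers (one per array element).
    """
    amp = math.sqrt(2 * L)

    def h(i, j):
        # h_i(j) of (3.5.7), with 1-based indices i = 1..2L and j = 1..4L
        if j == 2 * i - 1:
            return amp
        if j == 2 * i:
            return -amp
        return 0.0

    return [
        [complex(h(i, j), h(i + L, j)) for i in range(1, L + 1)]
        for j in range(1, 4 * L + 1)
    ]


def real_part(z):
    return z.real


def imag_part(z):
    return z.imag


def averaged_outer(seq, part_a, part_b):
    """(1/M) sum_j a(delta(j)) b(delta(j))^T, where a and b pick Re or Im."""
    M, L = len(seq), len(seq[0])
    return [
        [sum(part_a(d[r]) * part_b(d[c]) for d in seq) / M for c in range(L)]
        for r in range(L)
    ]


def has_odd_symmetry(seq):
    """True if every delta(i) has a partner delta(j) = -delta(i)."""
    return all(
        any(all(a == -b for a, b in zip(d, e)) for e in seq) for d in seq
    )


seq = time_multiplex_sequence(3)                      # L = 3, so M = 12
gram_re = averaged_outer(seq, real_part, real_part)   # compare with identity
cross = averaged_outer(seq, real_part, imag_part)     # compare with zero
```

Only one weight component is perturbed at any instant, which is what makes the sequence "time multiplexed" and keeps the averaged outer products diagonal.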
FIGURE 3.2

Schematic diagram of a single receiver system. 


3.5.2 Single-Receiver System 

In this section, a gradient estimation scheme using a single-receiver system is described.
Figure 3.2 shows a schematic diagram of a single-receiver system. The sequence S is used
to perturb the array weights about their nominal value $w(n)$. The instantaneous output
power is then correlated with the sequence S and an estimate of the required gradient is
made.

At the ith instant within the perturbation cycle, $1 \le i \le M$, the weight vector is given by

$$w_+(w(n), i) = w(n) + \gamma\,\delta(i), \qquad 1 \le i \le M \qquad (3.5.9)$$

where $\gamma$ is a real positive scalar and denotes the perturbation step size. An estimate of the
gradient is given by

$$g_1(w(n)) = \frac{1}{\gamma M}\sum_{i=1}^{M} f_1(w_+, i)\,\delta(i) \qquad (3.5.10)$$

where $f_1(w_+, i)$ is the instantaneous array output power given by

$$f_1(w_+, i) = w_+^H(w(n), i)\, x(l+i)\, x^H(l+i)\, w_+(w(n), i) \qquad (3.5.11)$$

and $l$ is the time instant at which the perturbation cycle is initiated.

If the orthogonal perturbation sequence has odd symmetry, then for any $\gamma > 0$, the
estimate of the gradient defined by (3.5.10) is unbiased for a given $w(n)$, that is,

$$E[g_1(w(n)) \mid w(n)] = 2Rw(n) \qquad (3.5.12)$$


3.5.2.1 Covariance of the Gradient Estimate 

Let $V_{g_1}(w(n))$ denote the covariance of the gradient estimate defined by (3.5.10). If $\{x(n)\}$
is an i.i.d. Gaussian sequence, then for the time multiplex perturbation sequence defined
by (3.5.8),

$$V_{g_1}(w(n)) = \gamma^2\, 2L\,(\mathrm{diag}(R))^2 + \frac{1}{2L\gamma^2}\{w^H(n)Rw(n)\}^2 I + 2\,\mathrm{diag}[Rw(n)w^H(n)R] + 2w^H(n)Rw(n)\,\mathrm{diag}(R) \qquad (3.5.13)$$

The second and fourth terms in (3.5.13) are proportional to $w^H(n)Rw(n)$, the quantity that
the adaptive algorithm is attempting to minimize. Thus, the gradient estimate improves
as the weight vector approaches the optimum. However, the first and third terms do not
necessarily decrease. Interestingly, the fourth term is similar to the term in $V_{g_S}(w(n))$, the
covariance of the gradient estimate used in the standard algorithm. The first and second
terms are penalties due to the use of perturbation for estimating the gradient. The third
term is due to the mean of the gradient, which is not canceled in the single-receiver system.


3.5.2.2 Perturbation Noise 

Although the estimated gradient is unbiased and independent of $\gamma$, the covariance of the
gradient is a function of $\gamma$. Furthermore, the presence of perturbations on the weights
causes an increase in the output power. This power is proportional to $\gamma^2$. An indication
of the effect of the perturbation can be obtained by determining the excess output power,
referred to as the perturbation noise, $\xi$, due to perturbation of weights about a nominal
weight $w(n)$.

For any orthogonal sequence S having a zero mean, the excess power due to perturbation
about a nominal weight $w(n)$ is given by

$$\xi(\gamma) = 2\gamma^2\,\mathrm{Tr}(R) \qquad (3.5.14)$$

Note from (3.5.13) that $V_{g_1}(w(n))$ is a convex function of $\gamma$, and the optimal value $\hat{\gamma}(w(n))$
for which $V_{g_1}(w(n))$ is minimum can be found. For a time multiplex perturbation sequence,
the following result can be established.

Let $\hat{\gamma}(w(n))$ represent the value of $\gamma(w(n))$ for which $V_{g_1}(w(n))$ is minimum. Then

$$\hat{\gamma}(w(n)) = \left[\frac{w^H(n)Rw(n)}{2\,\mathrm{Tr}(R)}\right]^{1/2} \qquad (3.5.15)$$

Let $\hat{V}_{g_1}(w(n))$ represent the value of $V_{g_1}(w(n))$ at $\hat{\gamma}(w(n))$. Then

$$\hat{V}_{g_1}(w(n)) = 4w^H(n)Rw(n)\,\mathrm{diag}(R) + 2\,\mathrm{diag}[Rw(n)w^H(n)R] \qquad (3.5.16)$$

The perturbation noise when the optimal $\gamma$ is used can be obtained by substituting
(3.5.15) in (3.5.14). The result is given by

$$\xi(\hat{\gamma}(n)) = w^H(n)Rw(n) \qquad (3.5.17)$$

Assuming that the gradient algorithm converges, $\xi(\hat{\gamma}(n))$ is approximately given by

$$\xi(\hat{\gamma}(n)) \approx \hat{w}^H R \hat{w} \qquad (3.5.18)$$
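
The relations (3.5.14) and (3.5.17) can be illustrated with a small numerical experiment in which the instantaneous sample $x x^H$ is replaced by its mean $R$. The sketch below averages the mean output power over one perturbation cycle of the time multiplex sequence and subtracts the unperturbed power. The 2-element matrix `R`, the weight vector `w`, and all function names are hypothetical values chosen only for illustration.

```python
import math


def time_multiplex_sequence(L):
    """Length-4L time multiplex sequence of (3.5.7) and (3.5.8)."""
    amp = math.sqrt(2 * L)

    def h(i, j):
        return amp if j == 2 * i - 1 else (-amp if j == 2 * i else 0.0)

    return [
        [complex(h(i, j), h(i + L, j)) for i in range(1, L + 1)]
        for j in range(1, 4 * L + 1)
    ]


def mean_power(w, R):
    """Mean output power w^H R w for weights w and Hermitian R."""
    n = len(w)
    return sum(
        w[a].conjugate() * R[a][b] * w[b] for a in range(n) for b in range(n)
    ).real


def perturbation_noise(w, R, gamma):
    """Excess mean power, averaged over one perturbation cycle."""
    seq = time_multiplex_sequence(len(w))
    perturbed = [
        mean_power([wi + gamma * di for wi, di in zip(w, d)], R) for d in seq
    ]
    return sum(perturbed) / len(seq) - mean_power(w, R)


def optimal_gamma(w, R):
    """Perturbation step size of (3.5.15)."""
    trace = sum(R[k][k].real for k in range(len(R)))
    return math.sqrt(mean_power(w, R) / (2.0 * trace))


# Hypothetical 2-element example (values chosen only for illustration).
R = [[2.0 + 0j, 0.5 - 0.5j], [0.5 + 0.5j, 1.0 + 0j]]
w = [1.0 + 0.5j, -0.3 + 0.2j]
noise_at_optimum = perturbation_noise(w, R, optimal_gamma(w, R))
```

With the optimal step size, the perturbation noise equals the unperturbed mean output power, so the price of the minimum-variance gradient estimate is roughly a doubling of the output power during the perturbation cycle.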
FIGURE 3.3

Schematic diagram of a two-receiver system. 


3.5.3 Dual-Receiver System 

In this section, a gradient estimation scheme using a processor with two receivers is
described. In a two-receiver system, an estimate of the required gradient can be obtained
by applying a perturbation sequence S in antiphase to the two sets of weights, and
correlating the difference power from the receivers with S, as shown in Figure 3.3 with
switch position A. Thus, Receiver 1 has its weights perturbed according to

$$w_+(w(n), i) = w(n) + \gamma\,\delta(i), \qquad 1 \le i \le M \qquad (3.5.19)$$

and Receiver 2 has its weights perturbed according to

$$w_-(w(n), i) = w(n) - \gamma\,\delta(i), \qquad 1 \le i \le M \qquad (3.5.20)$$

Let $f_1(w_+, i)$ and $f_2(w_-, i)$ denote the instantaneous output power at Receivers 1 and 2,
respectively. An estimate of the gradient is given by

$$g_2(w(n)) = \frac{1}{2\gamma M}\sum_{i=1}^{M} [f_1(w_+, i) - f_2(w_-, i)]\,\delta(i) \qquad (3.5.21)$$

For a given weight vector $w(n)$, the estimate of the gradient defined by (3.5.21) is unbiased
for any orthogonal perturbation sequence S.
3.5.3.1 Dual-Receiver System with Reference Receiver

In a two-receiver system, an estimate of the required gradient can also be obtained by
using a perturbation sequence S to perturb the array weights of only one of the receivers
about their nominal value $w(n)$, while the other receiver has its weights fixed at $w(n)$, as
shown in Figure 3.3 with switch position B. Let Receiver 1 have its weights perturbed by
a sequence S so that its weight vector is given by

$$w_+(w(n), i) = w(n) + \gamma\,\delta(i), \qquad 1 \le i \le M \qquad (3.5.22)$$

Let $f_1(w_+, i)$ and $f_2(w, i)$ denote the output power of Receivers 1 and 2, respectively. An
estimate of the gradient is given by

$$g_3(w(n)) = \frac{1}{\gamma M}\sum_{i=1}^{M} [f_1(w_+, i) - f_2(w, i)]\,\delta(i) \qquad (3.5.23)$$

The estimate of the gradient defined by (3.5.23) is unbiased when S is an orthogonal
perturbation sequence and has odd symmetry.
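
The unbiasedness of the three estimates (3.5.10), (3.5.21), and (3.5.23) can be illustrated by evaluating them in the mean, that is, with each instantaneous power replaced by its expected value $w^H R w$ for the weights in use. The sketch below does this for the time multiplex sequence of (3.5.8) and returns all three mean estimates, which should each equal $2Rw(n)$. It is a check under stated assumptions rather than a simulation of the receivers: the function names are hypothetical and the snapshots $x(n)$ themselves are not generated.

```python
import math


def time_multiplex_sequence(L):
    """Length-4L time multiplex sequence of (3.5.7) and (3.5.8)."""
    amp = math.sqrt(2 * L)

    def h(i, j):
        return amp if j == 2 * i - 1 else (-amp if j == 2 * i else 0.0)

    return [
        [complex(h(i, j), h(i + L, j)) for i in range(1, L + 1)]
        for j in range(1, 4 * L + 1)
    ]


def mean_power(w, R):
    """Mean output power w^H R w for weights w and Hermitian R."""
    n = len(w)
    return sum(
        w[a].conjugate() * R[a][b] * w[b] for a in range(n) for b in range(n)
    ).real


def expected_gradient_estimates(w, R, gamma):
    """Mean values of g1 (3.5.10), g2 (3.5.21), and g3 (3.5.23)."""
    seq = time_multiplex_sequence(len(w))
    M = len(seq)
    p_ref = mean_power(w, R)  # reference receiver, weights fixed at w(n)
    g1 = [0j] * len(w)
    g2 = [0j] * len(w)
    g3 = [0j] * len(w)
    for d in seq:
        p_plus = mean_power([wi + gamma * di for wi, di in zip(w, d)], R)
        p_minus = mean_power([wi - gamma * di for wi, di in zip(w, d)], R)
        for k, dk in enumerate(d):
            g1[k] += p_plus * dk / (gamma * M)
            g2[k] += (p_plus - p_minus) * dk / (2.0 * gamma * M)
            g3[k] += (p_plus - p_ref) * dk / (gamma * M)
    return g1, g2, g3


def true_gradient(w, R):
    """The gradient 2 R w that the estimates are meant to reproduce."""
    n = len(w)
    return [2.0 * sum(R[a][b] * w[b] for b in range(n)) for a in range(n)]


# Hypothetical 2-element example (values chosen only for illustration).
R = [[2.0 + 0j, 0.5 - 0.5j], [0.5 + 0.5j, 1.0 + 0j]]
w = [1.0 + 0.5j, -0.3 + 0.2j]
g1, g2, g3 = expected_gradient_estimates(w, R, gamma=0.1)
```

In the mean, the $\gamma$-independent and $\gamma^2$ terms of the perturbed power cancel through the zero-mean and odd-symmetry properties, leaving only the term linear in $\gamma$, which is why the result does not depend on the perturbation step size.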

3.5.3.2 Covariance of Gradient 

For two-receiver systems, the following result can be established. Let $V_{g_2}(w(n))$ and
$V_{g_3}(w(n))$ denote the covariance of the gradient estimated by (3.5.21) and (3.5.23),
respectively. If $\{x(n)\}$ is an i.i.d. Gaussian sequence, then for the time multiplex perturbation
sequence defined by (3.5.8),

$$V_{g_2}(w(n)) = 2w^H(n)Rw(n)\,\mathrm{diag}(R) \qquad (3.5.24)$$

and

$$V_{g_3}(w(n)) = \gamma^2\, 2L\,(\mathrm{diag}(R))^2 + 2w^H(n)Rw(n)\,\mathrm{diag}(R) \qquad (3.5.25)$$

$V_{g_2}(w(n))$ and the second term of $V_{g_3}(w(n))$ are proportional to $w^H(n)Rw(n)$, the quantity
that the adaptive algorithm is attempting to minimize. Thus, the gradient estimate
improves as the weight vector approaches the optimal value.


3.5.4 Covariance of Weights 

It can be established that the weight covariance matrix is diagonalizable when the covariance
of the gradient used for updating $w(n)$ is $V_{g_2}(w(n))$ or $V_{g_3}(w(n))$. Thus, for these
two cases an analysis of the weight covariance matrix is possible by developing matrix
and vector difference equations using the scheme presented in Section 3.4. The results on
the transient and steady-state behavior of this matrix for the two cases are presented in
this section [God86].

The weight covariance matrix is not diagonalizable when the covariance of the gradient
used for updating $w(n)$ is $V_{g_1}(w(n))$. Consequently, it is not possible to describe the
transient and the steady-state behavior of the weight covariance matrix for the
single-receiver system using the scheme presented in Section 3.4.
3.5.4.1 Dual-Receiver System with Dual Perturbation

Substituting $V_{g_2}(w(n))$ for $V_g(w(n))$ in (3.4.14), and following a procedure similar to that
used in Section 3.4.5, the following matrix difference equation is derived:

$$\Sigma(n+1) = (I - 4\mu\Lambda + 4\mu^2\Lambda^2)\Sigma(n) + \mu^2\frac{2}{L}\mathrm{Tr}(R)[\mathrm{Tr}(\Lambda\Sigma(n)) + k_0(n)]\,\Gamma \qquad (3.5.26)$$

where $\Gamma$ is a diagonal matrix with its diagonal elements being the eigenvalues of $P$.

Let an L-dimensional vector $\vartheta$ denote the L eigenvalues of $P$ and an L-dimensional
vector $\eta_2(n)$ denote the L eigenvalues of the weight covariance matrix $k_{ww_2}(n)$ when the
covariance of the gradient used is $V_{g_2}(w(n))$. Since $\mathrm{Tr}(\Lambda\Sigma(n)) = \lambda^T\eta_2(n)$, (3.5.26) reduces to
the following vector difference equation:

$$\eta_2(n+1) = \left[I - 4\mu\Lambda + 4\mu^2\Lambda^2 + \mu^2\frac{2}{L}\mathrm{Tr}(R)\,\vartheta\lambda^T\right]\eta_2(n) + \mu^2\frac{2}{L}\mathrm{Tr}(R)\,k_0(n)\,\vartheta \qquad (3.5.27)$$

With

$$H_2 = 4\mu\Lambda - 4\mu^2\Lambda^2 - \frac{2}{L}\mathrm{Tr}(R)\,\mu^2\,\vartheta\lambda^T \qquad (3.5.28)$$

the solution of (3.5.27) is given by

$$\eta_2(n) = (I - H_2)^n\eta_2(0) + \mu^2\frac{2}{L}\mathrm{Tr}(R)\sum_{i=1}^{n} k_0(n-i)(I - H_2)^{i-1}\vartheta \qquad (3.5.29)$$

and $k_{ww_2}(n)$ is given by

$$k_{ww_2}(n) = \sum_{l=1}^{L} \eta_{2l}(n)\, Q_l Q_l^H \qquad (3.5.30)$$

where $\eta_2(0)$ is the vector of eigenvalues of $k_{ww_2}(0)$ and $Q_l$, $l = 1, 2, \ldots, L$ are the eigenvectors
of $PRP$. Equations (3.5.29) and (3.5.30) completely describe the transient behavior of the
weight covariance matrix.

The steady-state expression for the weight covariance matrix is obtained by substituting
the steady-state value of the L - 1 nonzero eigenvalues of the weight covariance matrix. The
steady-state expression for these eigenvalues is given by

$$\lim_{n\to\infty} \eta'_2(n) = \frac{\mu\frac{\mathrm{Tr}(R)}{2L}\hat{w}^H R \hat{w}}{1 - \mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}} \left[\frac{1}{\lambda'_1(1-\mu\lambda'_1)}, \ldots, \frac{1}{\lambda'_{L-1}(1-\mu\lambda'_{L-1})}\right]^T \qquad (3.5.31)$$
3.5.4.2 Dual-Receiver System with Reference Receiver

Substituting $V_{g_3}(w(n))$ for $V_g(w(n))$ in (3.4.14), and following a procedure similar to that
used in Section 3.4.5, the following matrix difference equation is derived:

$$\Sigma(n+1) = (I - 4\mu\Lambda + 4\mu^2\Lambda^2)\Sigma(n) + \mu^2\frac{2}{L}\left(\gamma^2(\mathrm{Tr}(R))^2 + \mathrm{Tr}(R)(\mathrm{Tr}(\Lambda\Sigma(n)) + k_0(n))\right)\Gamma \qquad (3.5.32)$$

Denoting the eigenvalues of the weight covariance matrix $k_{ww_3}(n)$ by an L-dimensional
vector $\eta_3(n)$ when the covariance of the gradient used is $V_{g_3}(w(n))$, (3.5.32) yields the
following vector difference equation:

$$\eta_3(n+1) = \left[I - 4\mu\Lambda + 4\mu^2\Lambda^2 + \mu^2\frac{2}{L}\mathrm{Tr}(R)\,\vartheta\lambda^T\right]\eta_3(n) + \mu^2\frac{2}{L}\mathrm{Tr}(R)\,k_0(n)\,\vartheta + \mu^2\frac{2}{L}\gamma^2[\mathrm{Tr}(R)]^2\,\vartheta \qquad (3.5.33)$$

which has the solution

$$\eta_3(n) = (I - H_2)^n\eta_3(0) + \mu^2\frac{2}{L}\mathrm{Tr}(R)\sum_{i=1}^{n}\left(k_0(n-i) + \gamma^2\mathrm{Tr}(R)\right)(I - H_2)^{i-1}\vartheta \qquad (3.5.34)$$

and $k_{ww_3}(n)$ is given by

$$k_{ww_3}(n) = \sum_{l=1}^{L} \eta_{3l}(n)\, Q_l Q_l^H \qquad (3.5.35)$$

where $\eta_3(0)$ is the vector of eigenvalues of $k_{ww_3}(0)$.

The steady-state behavior is obtained by substituting the steady-state expression for the L - 1
nonzero eigenvalues given by

$$\lim_{n\to\infty} \eta'_3(n) = \frac{\mu\frac{\mathrm{Tr}(R)}{2L}\left[\hat{w}^H R \hat{w} + \gamma^2\mathrm{Tr}(R)\right]}{1 - \mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}} \left[\frac{1}{\lambda'_1(1-\mu\lambda'_1)}, \ldots, \frac{1}{\lambda'_{L-1}(1-\mu\lambda'_{L-1})}\right]^T \qquad (3.5.36)$$

It could not be established that the weight covariance matrix is diagonalizable when the
gradient is estimated using the single-receiver system, and thus an analysis of the behavior
of the weight covariance matrix is not possible. However, some results on the
misadjustment for this case are presented along with the two other cases.
3.5.5 Misadjustment Results

In this section, exact expressions for the misadjustment are presented for the dual receiver 
system and bounds on the misadjustment are presented for the single-receiver case. 

3.5.5.1 Single-Receiver System 

For a single-receiver system with 


y(w(n)) = i 


w H (n)Rw(n) 
2Tr(R) 


iV 2 


0<p< 


a L-l 


41 L 


|Tr(R) + 1.5A. n 


then the misadjustment is bounded by


where 


b L1 <Mi<b H1 


b Ll = 


at L-l 


41 L 


|Tr(R) + 0.5(L-l)w H Rw 


¥i Tr(R) 


and 


b Hl = 


a f L-l 


41 L 


|Tr(R) + 0.5(L - 1 )w h Rw 


1-p 


a f L-l 


41 L 


|Tr(R)+ 1.5A, n 


a= c + 


Note that c = 1 corresponds to the optimal y. 

3.5.5.2 Dual-Receiver System with Dual Perturbation 

If, for a given $w(n)$, the covariance of the gradient is given by

$$V_g(w(n)) = V_{g_2}(w(n)) \qquad (3.5.43)$$

$$0 < \mu < \frac{1}{\lambda_{\max} + \mathrm{Tr}(R)/2L} \qquad (3.5.44)$$

and

$$\mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i} < 1 \qquad (3.5.45)$$

then the misadjustment is given by

$$M_2 = \frac{\mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}}{1 - \mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}} \qquad (3.5.46)$$



3.5.5.3 Dual-Receiver System with Reference Receiver 

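If, for a given $w(n)$, the covariance of the gradient is

$$V_g(w(n)) = V_{g_3}(w(n)) \qquad (3.5.47)$$

$$0 < \mu < \frac{1}{\lambda_{\max} + \mathrm{Tr}(R)/2L} \qquad (3.5.48)$$

and

$$\mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i} < 1 \qquad (3.5.49)$$

then the misadjustment is given by

$$M_3 = \left(1 + \gamma^2\frac{\mathrm{Tr}(R)}{\hat{w}^H R \hat{w}}\right)\frac{\mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}}{1 - \mu\frac{\mathrm{Tr}(R)}{2L}\sum_{i=1}^{L-1}\frac{1}{1-\mu\lambda'_i}} \qquad (3.5.50)$$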


3.6 Structured Gradient Algorithm 

In this section, a description and an analysis of the constrained LMS algorithm are presented
for the case when it uses a gradient estimated from an estimate of the array correlation matrix
having a special structure [God89, God90, God93, God97]. The algorithm for this case is referred
to as the structured gradient algorithm.

The gradient estimate given by (3.4.4) for the standard constrained LMS algorithm can
be expressed as

$$g_S(w(n)) = 2R(n+1)w(n) \qquad (3.6.1)$$

where

$$R(n) = x(n)x^H(n) \qquad (3.6.2)$$

is a noisy sample of the array correlation matrix at the nth instant of time estimated from
only one array sample.

For a uniformly spaced linear array, the array correlation matrix $R$ has the Toeplitz
structure, that is,

$$R = \begin{bmatrix} r_0 & r_1 & \cdots & r_{L-1} \\ r_1^* & r_0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & r_1 \\ r_{L-1}^* & \cdots & r_1^* & r_0 \end{bmatrix} \qquad (3.6.3)$$

where $r_i$, $i = 0, 1, \ldots, L-1$ denote the correlation between elements with lag $i$, defined as

$$r_i = E[x_m(n)\,x^*_{m+i}(n)], \qquad i = 0, 1, \ldots, L-1, \quad m = 1, 2, \ldots, L \qquad (3.6.4)$$

and $x_m(n)$ denotes the signal derived from the mth element at the nth time instant.

Note that not all combinations of $m$ and $i$ are possible in (3.6.4), as there are only L
elements in the array. For $i = 0$, $m = 1, \ldots, L$, yielding L values of $r_0$. These values form the
main diagonal of $R$ in (3.6.3). For $i = 1$, $m = 1, 2, \ldots, L-1$ results in L - 1 values of $r_1$. These
values form the second diagonal of $R$, and so on.

The noisy sample of $R$ used in estimating the gradient for the standard algorithm does
not have the Toeplitz structure. The structured gradient algorithm exploits this structure
of the array correlation matrix in estimating the gradient. It takes $R(n)$ and estimates an
array correlation matrix $\tilde{R}(n)$ having the Toeplitz structure. The structured array correlation
matrix $\tilde{R}(n)$ is then used in gradient estimation as discussed below.


3.6.1 Gradient Estimate 

For this algorithm, the gradient estimate is defined as follows:

$$g_{St}(w(n)) = 2\tilde{R}(n+1)w(n) \qquad (3.6.5)$$

where $\tilde{R}(n)$ is an estimate of the array correlation matrix at the nth instant of time having
the Toeplitz structure of (3.6.3), and is given by

$$\tilde{R}(n) = \begin{bmatrix} \tilde{r}_0(n) & \tilde{r}_1(n) & \cdots & \tilde{r}_{L-1}(n) \\ \tilde{r}_1^*(n) & \tilde{r}_0(n) & \ddots & \vdots \\ \vdots & \ddots & \ddots & \tilde{r}_1(n) \\ \tilde{r}_{L-1}^*(n) & \cdots & \tilde{r}_1^*(n) & \tilde{r}_0(n) \end{bmatrix} \qquad (3.6.6)$$

with

$$\tilde{r}_l(n) = \frac{1}{N_l}\sum_{i} x_i(n)\,x^*_{i+l}(n), \qquad l = 0, 1, \ldots, L-1 \qquad (3.6.7)$$

where $N_l$ denotes the number of possible combinations of elements with lag $l$, and the
summation is over all these combinations. For a linear array of equispaced elements, $N_l = L - l$.
Since

$$E[\tilde{r}_l(n)] = r_l, \qquad l = 0, 1, \ldots, L-1 \qquad (3.6.8)$$

it follows from (3.6.7), (3.6.6), and (3.6.3) that

$$E[\tilde{R}(n)] = R \qquad (3.6.9)$$

Thus, for a given $w(n)$,

$$E[g_{St}(w(n))] = 2Rw(n) \qquad (3.6.10)$$

and the gradient estimate is unbiased.

The discussion on the structured gradient algorithm presented here is for an equispaced
linear array. The formulation can easily be extended to an arbitrary array.

For the equispaced linear array, each element of $\tilde{R}(n)$ is a mean value of all elements of
$R(n)$ with the same spatial correlation lags. Thus, $\tilde{r}_0(n)$ is an average of the main diagonal
elements of $R(n)$, $\tilde{r}_1(n)$ is the mean of the first diagonal elements of $R(n)$, and so on. For an
array that is not an equispaced linear array, the array correlation matrix loses its Toeplitz
structure, and the number of elements in $R$ with the same spatial correlation lags is less
in comparison to the equispaced case. However, there are always some elements in $R$ with
the same spatial correlation lags. Even in a completely unstructured correlation matrix,
such as would be obtained from a three-element array with spacing d and 2d, the diagonal
elements are always of the same correlation lag, namely lag 0.
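
The averaging along diagonals described above can be sketched in a few lines. The code below forms the single-snapshot sample of (3.6.2) and the Toeplitz-structured estimate of (3.6.6) and (3.6.7) for an equispaced linear array. The function names and the three-element snapshot are assumptions made for illustration; a practical implementation would likely use an array library rather than nested lists.

```python
def sample_correlation(x):
    """Single-snapshot sample R(n) = x x^H of (3.6.2); x is a list of complex."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]


def structured_estimate(x):
    """Toeplitz-structured estimate of (3.6.6) for an equispaced linear array.

    Each lag-l value averages x_i * conj(x_{i+l}) over the N_l = L - l
    element pairs with that lag, as in (3.6.7).
    """
    L = len(x)
    r = [
        sum(x[i] * x[i + l].conjugate() for i in range(L - l)) / (L - l)
        for l in range(L)
    ]
    # Entry (m, k) holds r[k - m] on and above the main diagonal and the
    # complex conjugate of r[m - k] below it, so the matrix is Hermitian.
    return [
        [r[k - m] if k >= m else r[m - k].conjugate() for k in range(L)]
        for m in range(L)
    ]


x = [1 + 1j, 2 + 0j, -1j]          # hypothetical snapshot from 3 elements
R_sample = sample_correlation(x)   # rank one, not Toeplitz in general
R_struct = structured_estimate(x)  # Toeplitz and Hermitian
```

Because the main diagonal of the structured estimate is the average of the main diagonal of the sample, both matrices have the same trace, that is, the same total measured power.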


3.6.2 Examples and Discussion 

Examples are presented in this section to compare the performance of the structured
gradient algorithm and the standard algorithm. The mean noise power for a given $w(n)$
is examined as a function of the weight update iteration to see the algorithm's effectiveness
in reducing noise.

Figure 3.4 to Figure 3.8 show plots of the mean noise power in dB for a given $w(n)$
vs. the iteration number. The mean noise power for a given $w(n)$ is calculated using

$$P_N(w(n)) = w^H(n)Rw(n) \qquad (3.6.11)$$

A linear array of ten elements with half-wavelength spacing is assumed for these
examples. The look direction is assumed to be broadside to the array. The power
of uncorrelated noise present on each element is assumed to be equal to 0.1. Two
interference sources are assumed to be present. Directions of these interferences make angles
of 98° and 45° with the line of the array. The other parameters are included in the figure
captions. The gradient algorithm is initialized with the conventional weight. The gradient
step sizes for the standard algorithm and the structured gradient algorithm are denoted
by $\mu_{ST}$ and $\mu_{SG}$, respectively. A comparison of the two algorithms in Figure 3.4 reveals that
the noise in the weights estimated by the structured gradient algorithm is much less than
that estimated by the standard algorithm.

FIGURE 3.4

$10 \log P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. The
curve with a solid line is for a structured gradient algorithm, and the one superimposed with a circle is for a
standard LMS algorithm. Two interferences: $\theta_1 = 98°$, $p_1 = 1$, $\theta_2 = 45°$, $p_2 = 10$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 =
90°$, $p_S = 1$, $\mu_{SG} = \mu_{ST} = 0.0005$. (From Godara, L.C. and Gray, D.A., J. Acoust. Soc. Am., 86, 1040-1046, 1989 [God89].
With permission.)

Figure 3.5 compares the two algorithms when the signal power is reduced by 20 dB
compared to the scenario of Figure 3.4. Comparing Figure 3.4 and Figure 3.5, one observes
that the fluctuations in the mean output noise power in Figure 3.4, where the signal power
is 1, are greater than in Figure 3.5, where the signal power is 0.01. Thus, the noise in the
weights estimated by the standard algorithm depends on the input signal power, and
increases as the signal power increases. On the other hand, the structured gradient algorithm
does not appear to be sensitive to the signal power. The signal sensitivity of the two
algorithms is further compared in Figure 3.6 and Figure 3.7, where the power of the second
interference is increased by 10 dB. Sensitivity of the standard algorithm to the input signal
level is clearly visible from the two figures. The noise fluctuation in weights estimated by
the standard LMS algorithm is greater in Figure 3.7, where the signal power is 1.0, than in
Figure 3.6, where the signal power is 0.01.

FIGURE 3.5

$10 \log P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. The
curve with a solid line is for a structured gradient algorithm, and the one superimposed with a circle is for a
standard LMS algorithm. Two interferences: $\theta_1 = 98°$, $p_1 = 1$, $\theta_2 = 45°$, $p_2 = 10$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 =
90°$, $p_S = 0.01$, $\mu_{SG} = \mu_{ST} = 0.0005$. (From Godara, L.C. and Gray, D.A., J. Acoust. Soc. Am., 86, 1040-1046, 1989 [God89].
With permission.)


The noise in the weights estimated by the standard LMS algorithm can be reduced by 
using a smaller gradient step size. A smaller step size, however, also slows the 
convergence of the algorithm, as 
shown in Figure 3.8, where the step size used for the standard algorithm is one-tenth of 
that used for the structured gradient algorithm. In this case, the structured gradient algorithm 
converges faster than the standard algorithm. 

It should be noted that the gradient estimate using the structured method requires more 
computation than the standard method. In the standard algorithm, an estimate of the 
gradient requires on the order of $L$ complex multiplications, whereas for the structured gradient 
algorithm it requires on the order of $L^2$ complex multiplications. A detailed discussion of 
the signal sensitivity of the LMS algorithms is presented in Section 3.14. 


3.7 Recursive Least Mean Squares Algorithm 

The recursive LMS algorithm uses all previous array samples to estimate the gradient, in 
comparison to the standard LMS algorithm, which uses only one array sample [God90a, 
God93]. In this section, the behavior of the recursive LMS algorithm is examined by 
deriving an expression for the covariance of the gradient. 
FIGURE 3.6 

$10\log P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. The 
curve with a solid line is for the structured gradient algorithm, and the one superimposed with circles is for the 
standard LMS algorithm. Two interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 45^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle 
$\theta_0 = 90^\circ$, $p_S = 0.01$, $\mu_{SG} = \mu_{ST} = 0.00005$. (From Godara, L.C. and Gray, D.A., J. Acoust. Soc. Am., 86, 1040-1046, 1989 
[God89]. With permission.) 


3.7.1 Gradient Estimate 

Let $g_R(w(n))$ denote the gradient estimated by the recursive method for a given $w(n)$, defined 
as 

$$g_R(w(n)) = 2R(n+1)w(n) \tag{3.7.1}$$

where 

$$R(n+1) = \frac{nR(n) + x(n+1)x^H(n+1)}{n+1} \tag{3.7.2}$$

FIGURE 3.7 

$10\log P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. The 
curve with a solid line is for the structured gradient algorithm, and the one superimposed with circles is for the 
standard LMS algorithm. Two interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 45^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle 
$\theta_0 = 90^\circ$, $p_S = 1$, $\mu_{SG} = \mu_{ST} = 0.00005$. (From Godara, L.C. and Gray, D.A., J. Acoust. Soc. Am., 86, 1040-1046, 1989 [God89]. 
With permission.) 

It follows from (3.7.2) that as the number of samples used in estimating the array correlation 
matrix increases, the matrix estimate approaches the true correlation matrix. Thus, 

$$\lim_{n\to\infty} R(n) = R \tag{3.7.3}$$

and 

$$\lim_{n\to\infty} g_R(w(n)) = 2Rw(n) \tag{3.7.4}$$

Consequently, the gradient estimate approaches the true gradient as $n \to \infty$. 
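
The recursion in (3.7.1) and (3.7.2) can be sketched in a few lines. The following is a minimal illustration in plain Python, assuming array samples are stored as lists of complex numbers; the function names and the sample values are hypothetical and chosen only for the example.

```python
def outer(x):
    """Return the L x L matrix x x^H for a complex vector x."""
    return [[xi * xj.conjugate() for xj in x] for xi in x]


def update_correlation(R_prev, x_new, n):
    """Eq. (3.7.2): R(n+1) = (n R(n) + x(n+1) x^H(n+1)) / (n + 1)."""
    xxh = outer(x_new)
    L = len(x_new)
    return [[(n * R_prev[i][j] + xxh[i][j]) / (n + 1) for j in range(L)]
            for i in range(L)]


def recursive_gradient(R_next, w):
    """Eq. (3.7.1): g_R(w(n)) = 2 R(n+1) w(n)."""
    return [2 * sum(row[j] * w[j] for j in range(len(w))) for row in R_next]


# Hypothetical two-element array samples (illustrative values only).
samples = [[1 + 1j, 2 - 1j], [0.5j, 1 + 0j], [-1 + 0j, 1j]]
R = [[0j, 0j], [0j, 0j]]          # R(0): no samples seen yet
for n, x in enumerate(samples):
    R = update_correlation(R, x, n)

w = [1 + 0j, 0j]
g = recursive_gradient(R, w)
```

After N samples the recursion reproduces the plain sample average of the outer products, which is why the estimate approaches R as more samples are used.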


3.7.2 Covariance of Gradient 

In this section, the covariance of the gradient is established. The result is valid for large $n$, 
such that 

$$R(n) \approx R \tag{3.7.5}$$

It follows from (3.7.1), (3.7.2), and (3.7.5) that 

$$g_R(w(n)) = \frac{2n}{n+1}Rw(n) + \frac{2}{n+1}x(n+1)x^H(n+1)w(n) \tag{3.7.6}$$

Let $V_{g_R}(w(n))$ denote the covariance of the gradient estimate defined by (3.7.1) and (3.7.2) 
for a given $w(n)$. By definition, 
FIGURE 3.8 

$10\log P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. The 
curve with a solid line is for the structured gradient algorithm, and the one superimposed with circles is for the 
standard LMS algorithm. Two interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 45^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle 
$\theta_0 = 90^\circ$, $p_S = 1$, $\mu_{SG} = 0.00005$, $\mu_{ST} = 0.000005$. (From Godara, L.C. and Gray, D.A., J. Acoust. Soc. Am., 86, 1040-1046, 
1989 [God89]. With permission.) 


$$V_{g_R}(w(n)) = E\left[g_R(w(n))g_R^H(w(n))\right] - \bar{g}_R(w(n))\bar{g}_R^H(w(n)) \tag{3.7.7}$$

where $\bar{g}_R(w(n))$ is the mean value of the gradient estimate for a given $w(n)$. 

It follows from (3.7.1) and (3.7.2) that 

$$\bar{g}_R(w(n)) = 2Rw(n) \tag{3.7.8}$$

Thus, 

$$\bar{g}_R(w(n))\bar{g}_R^H(w(n)) = 4Rw(n)w^H(n)R \tag{3.7.9}$$

Using the following result for an i.i.d. complex Gaussian sequence $\{x(k)\}$ and a Hermitian 
matrix $A$, 

$$E\left[x(n)x^H(n)Ax(n)x^H(n)\right] = RAR + \mathrm{Tr}(RA)R \tag{3.7.10}$$

the following is derived from (3.7.6): 

$$E\left[g_R(w(n))g_R^H(w(n))\right] = 4Rw(n)w^H(n)R + \frac{4}{(n+1)^2}w^H(n)Rw(n)R \tag{3.7.11}$$
Substituting in (3.7.7) from (3.7.9) and (3.7.11), 

$$V_{g_R}(w(n)) = \frac{4}{(n+1)^2}w^H(n)Rw(n)R \tag{3.7.12}$$

It follows from (3.7.12) that the covariance of the gradient estimated by the recursive 
method decreases as the iteration number increases and is $(n+1)^2$ times smaller than the 
covariance of the gradient estimated by the standard method. The latter, 
$V_{g_S}(w(n))$, is given by 

$$V_{g_S}(w(n)) = 4w^H(n)Rw(n)R \tag{3.7.13}$$
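
The step from (3.7.6) to (3.7.11) can be spelled out as follows. This is a sketch only: the shorthand $a = 2n/(n+1)$ and $b = 2/(n+1)$ is introduced here for compactness and is not part of the original notation, the arguments of $w(n)$ and $x(n+1)$ are suppressed, and $x(n+1)$ is taken to be independent of the given $w(n)$ with $E[xx^H] = R$.

```latex
\begin{align*}
E\left[g_R g_R^H\right]
  &= a^2 R w w^H R + 2ab\, R w w^H R
     + b^2 E\left[x x^H w w^H x x^H\right] \\
  &= (a^2 + 2ab)\, R w w^H R
     + b^2 \left( R w w^H R + \mathrm{Tr}(R w w^H) R \right)
     && \text{by (3.7.10) with } A = w w^H \\
  &= (a + b)^2 R w w^H R + b^2\, w^H R w\, R \\
  &= 4 R w w^H R + \frac{4}{(n+1)^2}\, w^H R w\, R
     && \text{since } a + b = 2
\end{align*}
```

Subtracting $4Rww^HR$ from (3.7.9) leaves only the second term, which is (3.7.12).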


3.7.3 Discussion 

As discussed previously, the projected covariance of the gradient $PV_g(w(n))P$ affects the 
weight covariance. Taking the projection on both sides of (3.7.12) and (3.7.13), and noting 
that $PRP$ is independent of the look direction signal, one observes that the projected 
covariance in both cases is proportional to the mean output power $w^H(n)Rw(n)$. This implies that 
in both cases the projected covariance is a function of the look direction signal, which 
in turn makes the weight covariance at each iteration sensitive to the look direction signal. 
However, at the nth iteration, the weight covariance for the recursive algorithm is 
smaller than that for the standard LMS algorithm because of the factor $1/(n+1)^2$ in (3.7.12). 


3.8 Improved Least Mean Squares Algorithm 

The structured gradient algorithm exploits the structure of the array correlation matrix. 
However, it does not make use of the previous samples when estimating the gradient at 
the nth iteration. In this section, a method is presented that exploits the structure of the 
array correlation matrix and uses previous samples. The method is referred to as the 
improved method [God90a, God93]. 

An estimate of the gradient using the improved method is given by 

$$g_I(w(n)) = 2\hat{R}(n+1)w(n) \tag{3.8.1}$$

where 

$$\hat{R}(n+1) = \frac{n\hat{R}(n) + \tilde{R}(n+1)}{n+1} \tag{3.8.2}$$

with $\tilde{R}(n)$ given by (3.6.6). 

It can easily be shown that the gradient estimate is unbiased, that is, 

$$E\left[g_I(w(n))\,|\,w(n)\right] = 2Rw(n) \tag{3.8.3}$$
FIGURE 3.9 

$P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. Two 
interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 72^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 = 90^\circ$. (From Godara, L.C., IEEE Trans. 
Antennas Propagat., 38, 1631-1635, 1990. ©IEEE. With permission.) 


The performance and the signal sensitivity of the above algorithm are now compared 
with an RLS algorithm that also makes use of the previous samples and requires the same order 
of computation for computing the weights. The following form of the RLS algorithm is 
used for the comparison: 

$$w(n) = \frac{R^{-1}(n)S_0}{S_0^HR^{-1}(n)S_0} \tag{3.8.4}$$

where $R^{-1}(n)$ is updated using the Matrix Inversion Lemma and is given by (3.1.6) and 
(3.1.7). Note that in the absence of errors, as $n \to \infty$, $R^{-1}(n) \to R^{-1}$ and $w(n) \to \hat{w}$. 

Figure 3.9 to Figure 3.12 compare the mean output noise power $P_N(w(n))$ vs. the iteration 
number for various look direction signal powers when the weights $w(n)$ are adjusted using 
the two algorithms. The mean output noise power is calculated using 

$$P_N(w(n)) = w^H(n)R_Nw(n) \tag{3.8.5}$$

A linear array of ten elements with half-wavelength spacing is assumed for these 
examples. The variance of uncorrelated noise present on each element is assumed to be 
equal to 0.1. Two interference sources are assumed to be present. The first interference 
falls in the main lobe of the conventional array pattern and makes an angle of 98° with 
the line of the array. The power of this interference is taken to be 10 dB more than the 
uncorrelated noise power. The second interference makes an angle of 72° with the line of 
the array and falls in the first side-lobe of the conventional pattern. The power of this 
interference is 30 dB more than the uncorrelated noise power. The look direction is broadside 
to the array. The signal power for the four plots is varied from 10 dB below the 
uncorrelated noise power to 30 dB above the uncorrelated noise power. The gradient algorithm is 
initialized with the conventional weights. 
FIGURE 3.10 

$P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. Two 
interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 72^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 = 90^\circ$. (From Godara, L.C., IEEE Trans. 
Antennas Propagat., 38, 1631-1635, 1990. ©IEEE. With permission.) 

FIGURE 3.11 

$P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. Two 
interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 72^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 = 90^\circ$. (From Godara, L.C., IEEE Trans. 
Antennas Propagat., 38, 1631-1635, 1990. ©IEEE. With permission.) 

For the improved LMS algorithm the gradient step size $\mu$ is taken to be equal to 0.00005, 
and for the RLS algorithm $\varepsilon_0$ is taken to be 0.0001. According to these figures, for a weak 
signal the RLS algorithm performs better than the improved algorithm. However, as the 
input signal power increases, the output noise power of the processor using the RLS 
algorithm increases. Thus, the RLS algorithm used in the present form is sensitive to the 
look direction signal. This is not the case for the improved LMS 
algorithm: its performance improves as the signal power is 
increased, and in the presence of a strong signal it performs much better than the RLS 
algorithm, both in terms of convergence and the output SNR. See, for example, the plots 
in Figure 3.12, where the input signal power is 30 dB more than the uncorrelated noise 
power. 

FIGURE 3.12 

$P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. Two 
interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 72^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 = 90^\circ$. (From Godara, L.C., IEEE Trans. 
Antennas Propagat., 38, 1631-1635, 1990. ©IEEE. With permission.) 

Figure 3.13 compares the performance of the standard LMS algorithm, the recursive LMS 
algorithm, and the improved LMS algorithm. The noise field and array geometry used for 
this example are the same as those used in the previous examples. The input signal power is 
30 dB more than the uncorrelated noise power and the gradient step size is 0.00005. It is 
clear from Figure 3.13 that the output noise power of the processor at each iteration is lower 
when the recursive algorithm or the improved algorithm is used than 
when the standard algorithm is used. The large fluctuation in the output of the 
processor using the standard algorithm, in comparison to the other two algorithms, 
indicates the sensitivity of this algorithm to the look direction signal. A comparison of the 
recursive LMS and improved LMS algorithms shows that the latter performs better, both in terms of 
the amount of noise and its variation as a function of the iteration number. 


3.9 Recursive Least Squares Algorithm 

FIGURE 3.13 

$P_N(w(n))$ vs. the iteration number for a 10-element linear array with one-half wavelength spacing. Two 
interferences: $\theta_1 = 98^\circ$, $p_1 = 1$, $\theta_2 = 72^\circ$, $p_2 = 100$, $\sigma_n^2 = 0.1$, look direction angle $\theta_0 = 90^\circ$. (From Godara, L.C., IEEE Trans. 
Antennas Propagat., 38, 1631-1635, 1990. ©IEEE. With permission.) 

© 2004 by CRC Press LLC 

The convergence speed of the LMS algorithm depends on the eigenvalues of the array 
correlation matrix. In an environment yielding an array correlation matrix with a large 
eigenvalue spread, the algorithm converges slowly. This problem is solved 
with the RLS algorithm [Sch77, d'As80, God97] by replacing the gradient step size $\mu$ with 
a gain matrix $R^{-1}(n)$ at the nth iteration, producing the weight update equation 

$$w(n) = w(n-1) - R^{-1}(n)x(n)\varepsilon^*(w(n-1)) \tag{3.9.1}$$

where $R(n)$ is given by 

$$R(n) = \delta_0R(n-1) + x(n)x^H(n) = \sum_{k=0}^{n}\delta_0^{n-k}x(k)x^H(k) \tag{3.9.2}$$

with $\delta_0$ denoting a real scalar less than but close to 1. The scalar $\delta_0$ is used for exponential 
weighting of past data and is referred to as the forgetting factor, as the update equation 
tends to de-emphasize the old samples. The quantity $1/(1-\delta_0)$ is normally referred to as 
the algorithm memory. Thus, for $\delta_0 = 0.99$ the algorithm memory is close to 100 samples. 
The RLS algorithm updates the required inverse of $R(n)$ using the previous inverse and the 
present sample as 

$$R^{-1}(n) = \frac{1}{\delta_0}\left[R^{-1}(n-1) - \frac{R^{-1}(n-1)x(n)x^H(n)R^{-1}(n-1)}{\delta_0 + x^H(n)R^{-1}(n-1)x(n)}\right] \tag{3.9.3}$$

The matrix is initialized as 

$$R^{-1}(0) = \frac{1}{\varepsilon_0}I, \quad \varepsilon_0 > 0 \tag{3.9.4}$$
The RLS algorithm minimizes the cumulative square error 

$$J(n) = \sum_{k=0}^{n}\delta_0^{n-k}|\varepsilon(k)|^2 \tag{3.9.5}$$

and its convergence is independent of the eigenvalue distribution of the correlation 
matrix. 
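
The inverse update via the Matrix Inversion Lemma can be sketched as below. This is a minimal plain-Python illustration, assuming matrices are nested lists of complex numbers; the function names, the sample values, and the choice to pass the error term in as an argument (its definition is not restated in this section) are assumptions made for the example.

```python
def mat_vec(M, v):
    """Return M v for a square matrix M and a vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def rls_inverse_update(R_inv, x, delta0):
    """Map R^{-1}(n-1) to R^{-1}(n) for R(n) = delta0 R(n-1) + x(n) x^H(n)."""
    L = len(x)
    u = mat_vec(R_inv, x)                                    # R^{-1}(n-1) x(n)
    v = [sum(x[i].conjugate() * R_inv[i][j] for i in range(L))
         for j in range(L)]                                  # x^H(n) R^{-1}(n-1)
    denom = delta0 + sum(x[i].conjugate() * u[i] for i in range(L))
    return [[(R_inv[i][j] - u[i] * v[j] / denom) / delta0 for j in range(L)]
            for i in range(L)]


def rls_weight_update(w, R_inv_new, x, err):
    """w(n) = w(n-1) - R^{-1}(n) x(n) err*, with err supplied by the caller."""
    k = mat_vec(R_inv_new, x)
    return [w[i] - k[i] * err.conjugate() for i in range(len(w))]


# Hypothetical values: track R(n) directly and via the inverse update.
eps0, delta0 = 0.1, 0.99
R = [[eps0 + 0j, 0j], [0j, eps0 + 0j]]
R_inv = [[1 / eps0 + 0j, 0j], [0j, 1 / eps0 + 0j]]
for x in [[1 + 1j, 2 - 1j], [0.5j, 1 + 0j], [-1 + 0j, 1j]]:
    R = [[delta0 * R[i][j] + x[i] * x[j].conjugate() for j in range(2)]
         for i in range(2)]
    R_inv = rls_inverse_update(R_inv, x, delta0)

# R_inv times R should be close to the identity matrix.
product = [[sum(R_inv[i][k] * R[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
w_new = rls_weight_update([0j, 0j], R_inv, [1 + 1j, 2 - 1j], 1 + 0j)
```

The point of the update is that each iteration costs on the order of $L^2$ operations instead of the $L^3$ needed to invert $R(n)$ from scratch.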

The algorithm presented here is the exact RLS algorithm. Other forms of the RLS 
algorithm with improved computational efficiency are available [Fab86, Cio84]. A comparison 
of the convergence speed of the LMS, RLS, and some other gradient-based algorithms 
using quantized or clipped data indicates that RLS is the most efficient and LMS is the 
slowest [Gar87]. 

A computer simulation study of the RLS, LMS, and SMI algorithms in mobile communication 
situations suggests that RLS outperforms the other two in flat fading channels 
[Fer93]. An application of the RLS algorithm to the reverse link of a cellular communication 
system using CDMA is considered in [Wan94] to show an increase in channel 
capacity by an adaptive array. 


3.10 Constant Modulus Algorithm 

The constant modulus algorithm (CMA) is gradient based [God97] and works on the premise that 
existing interference causes fluctuation in the amplitude of the array output, which otherwise 
has a constant modulus. It updates the weights by minimizing the cost function [Chi93, God80, 
Tre83, Shy93] 

$$J(n) = \frac{1}{2}E\left[\left(|y(n)|^2 - y_0^2\right)^2\right] \tag{3.10.1}$$

using the following equation 

$$w(n+1) = w(n) - \mu g(w(n)) \tag{3.10.2}$$

where 

$$y(n) = w^H(n)x(n+1) \tag{3.10.3}$$

is the array output after the nth iteration, $y_0$ is the desired amplitude in the absence of 
interference, and $g(w(n))$ denotes an estimate of the cost function gradient. 

Similar to the LMS algorithm discussed previously, the constant modulus algorithm 
uses an estimate of the gradient by replacing the true gradient with an instant value given 
by 

$$g(w(n)) = 2\varepsilon(n)x(n+1) \tag{3.10.4}$$
where 

$$\varepsilon(n) = \left(|y(n)|^2 - y_0^2\right)y(n) \tag{3.10.5}$$

The weight update equation for this case becomes 

$$w(n+1) = w(n) - 2\mu\varepsilon(n)x(n+1) \tag{3.10.6}$$

In appearance, this is similar to the LMS algorithm with reference signal, where 

$$\varepsilon(n) = r(n) - y(n) \tag{3.10.7}$$
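
A single constant modulus update can be sketched as follows. This is a minimal plain-Python illustration with hypothetical names and values. One assumption deserves emphasis: the sketch conjugates the error term in the weight update, which is what a descent step requires for complex data when the output is formed as $w^Hx$; for real-valued data this coincides with (3.10.6) as written.

```python
def array_output(w, x):
    """y = w^H x for complex weight and sample vectors."""
    return sum(wi.conjugate() * xi for wi, xi in zip(w, x))


def cma_step(w, x, mu, y0=1.0):
    """One constant modulus update; returns the new weights and the output y."""
    y = array_output(w, x)
    err = (abs(y) ** 2 - y0 ** 2) * y          # error term of eq. (3.10.5)
    # Assumption of this sketch: conjugate the error so that the step
    # reduces | |y|^2 - y0^2 | for complex-valued data.
    w_new = [wi - 2 * mu * err.conjugate() * xi for wi, xi in zip(w, x)]
    return w_new, y


# Hypothetical weights and sample.
w = [1 + 0j, 0.5j]
x = [1 + 1j, 1 - 1j]
w_next, y_before = cma_step(w, x, mu=0.05)
y_after = array_output(w_next, x)
```

With this convention the output for the same sample is scaled by the real factor $1 - 2\mu(|y|^2 - y_0^2)\|x\|^2$, so an output modulus below $y_0$ is pushed up and one above $y_0$ is pushed down, provided $\mu$ is small.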

The application of CMA to digital land mobile radio communication systems using TDMA to 
compensate for selective fading is studied in [Ohg91]. Discussions of hardware implementation 
of a CMA adaptive array and its BER performance for high-speed transmission in 
mobile communications may be found in [Ohg93a, Ohg93]. Development of CMA for 
beam-space array signal processing, including its hardware realization, has been reported 
in [Tan95]. The results presented in [Chi93] indicate that the beam-space CMA is able to 
cancel interferences arriving from directions other than the look direction. 

CMA is useful for eliminating correlated arrivals and is effective for constant-envelope 
modulated signals such as GMSK and QPSK, which are used in digital communications. 
However, the algorithm is not appropriate for the CDMA system because of the required 
power control [Wan94]. Use of CMA to blindly separate co-channel FM signals in mobile 
communications has been investigated in [Par95]. A variation referred to as differential 
CMA reported in [Nis95] has inferior convergence characteristics compared to CMA but 
may be improved using direction of arrival information to make it operative in beam space. 


3.11 Conjugate Gradient Method 

An application of the conjugate gradient method [Hes52, Dan67, Sar81] to adjust the 
weights of an antenna array is discussed in [Cho92]. The method is generally useful for 
solving a set of equations of the form Aw = b to obtain w. In this section, a brief description 
of the CGM is provided [God97]. 

For an array-processing problem, $w$ denotes the array weights, $A$ is a matrix with each 
of its columns denoting consecutive samples obtained from array elements, and $b$ is a 
vector containing consecutive samples of the desired signal. Thus, a residual vector 

$$r = b - Aw \tag{3.11.1}$$

denotes the error between the desired signal and the array output at each sample, with the sum 
of the squared errors given by $r^Hr$. 

The method starts with an initial guess $w(0)$ of the weights, obtains a residual 

$$r(0) = b - Aw(0) \tag{3.11.2}$$

an initial direction vector 

$$g(0) = A^Hr(0) \tag{3.11.3}$$

and moves the weights in this direction to yield the weight update equation 

$$w(n+1) = w(n) - \mu(n)g(n) \tag{3.11.4}$$

where the step size is 

$$\mu(n) = \frac{\|A^Hr(n)\|^2}{\|A^Hg(n)\|^2} \tag{3.11.5}$$

The residual $r(n)$ and the direction vector $g(n)$ are updated using 

$$r(n+1) = r(n) + \mu(n)Ag(n) \tag{3.11.6}$$

and 

$$g(n+1) = A^Hr(n+1) - \alpha(n)g(n) \tag{3.11.7}$$

with 

$$\alpha(n) = \frac{\|A^Hr(n+1)\|^2}{\|A^Hr(n)\|^2} \tag{3.11.8}$$


The algorithm is stopped when the residual falls below a certain predetermined level. 
It should be noted that the direction vector points in the direction of the gradient of the error surface 
$r^H(n)r(n)$ at the nth iteration, which the algorithm is trying to minimize. The method 
converges to the error surface minimum within at most $L$ iterations for an $L$-rank matrix 
equation, and thus provides the fastest convergence of all iterative methods [Cho91, 
Cho92]. 
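
A conjugate gradient iteration for the least-squares problem $Aw = b$ can be sketched as below. The sketch uses the common CGLS (CGNR) bookkeeping, in which the direction vector is a descent direction that is added to the weights and the step size is normalized by the squared norm of $A$ times the direction; these sign and normalization conventions, like the function names and the small test system, are assumptions of the sketch rather than a transcription of (3.11.3) to (3.11.8).

```python
def matvec(A, v):
    """Return A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def herm_matvec(A, v):
    """Return A^H v."""
    rows, cols = len(A), len(A[0])
    return [sum(A[i][j].conjugate() * v[i] for i in range(rows))
            for j in range(cols)]


def norm_sq(v):
    """Squared Euclidean norm of a (possibly complex) vector."""
    return sum(abs(c) ** 2 for c in v)


def conjugate_gradient_ls(A, b, w0, tol=1e-12, max_iter=None):
    """Minimize ||b - A w||^2 by conjugate gradients on the normal equations."""
    w = list(w0)
    r = [bi - awi for bi, awi in zip(b, matvec(A, w))]   # residual b - A w
    s = herm_matvec(A, r)                                # A^H r
    p = list(s)                                          # initial direction
    max_iter = max_iter or len(w0)
    for _ in range(max_iter):
        if norm_sq(s) <= tol:
            break
        Ap = matvec(A, p)
        step = norm_sq(s) / norm_sq(Ap)
        w = [wi + step * pi for wi, pi in zip(w, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        s_new = herm_matvec(A, r)
        beta = norm_sq(s_new) / norm_sq(s)
        p = [sn + beta * pi for sn, pi in zip(s_new, p)]
        s = s_new
    return w, r


# Hypothetical 2 x 2 system with exact solution w = [0.8, 1.4].
A = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
w_est, residual = conjugate_gradient_ls(A, b, [0.0, 0.0])
```

For this two-unknown system the iteration stops after two steps, in line with the statement that at most $L$ iterations are needed for an $L$-rank problem.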

Use of the conjugate gradient method to eliminate multipath fading in mobile commu¬ 
nication situations has been studied in [Cho92] to show that the BER performance of the 
system using the conjugate gradient method is better than that using the RLS algorithm. 


3.12 Neural Network Approach 

In this section, a neural network-based algorithm to estimate the weights of an adaptive 
array system is described [God97]. For discussion on various aspects of this algorithm, 
referred to as Madaline rule III (MRIII), as well as other related issues, see [Wid90]. For 
the general theory of neural networks and their applications, see [Fau90, Gel96]. 

The MRIII algorithm described here is applicable when the reference signal is available 
and minimizes the MSE between the reference signal and the modified array output, rather 
than the MSE between the reference signal and the array output, as is the case for other 
algorithms discussed previously. The array output is modified using a nonlinear mapping, 
such as the hyperbolic tangent 

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \tag{3.12.1}$$

and the weights are updated using 

$$w(n+1) = w(n) - \mu g(w(n)) \tag{3.12.2}$$

where $\mu$ is the gradient step size and $g(w(n))$ is the instant gradient of the MSE surface 
with respect to the array weights $w(n)$. 

When the array is operating with weights $w(n)$, producing the array output 

$$y(n) = w^H(n)x(n+1) \tag{3.12.3}$$

the modified output $\tilde{y}(n)$ becomes 

$$\tilde{y}(n) = \tanh(y(n)) \tag{3.12.4}$$

and the resulting error signal is given by 

$$\varepsilon(n) = \tilde{y}(n) - r(n+1) \tag{3.12.5}$$

$$g(w(n)) = 2\varepsilon(n)\frac{\Delta\varepsilon(n)}{\Delta y}x(n+1) \tag{3.12.7}$$

where $\Delta\varepsilon(n)$ denotes the change in the error output when the array output is perturbed 
by a small amount $\Delta y$, and can be measured to estimate the instant gradient. The 
weight update equation then becomes 

$$w(n+1) = w(n) - 2\mu\varepsilon(n)\frac{\Delta\varepsilon(n)}{\Delta y}x(n+1) \tag{3.12.8}$$

The MSE surface of the error signal $\varepsilon(n)$ may have local minima, and thus global 
convergence of the MRIII algorithm is not guaranteed, which is not the case when the MSE 
between the reference signal and the array output is minimized [Wid90]. The algorithm, 
however, is very robust, suitable for analog implementation, and results in fast weight 
updates. 
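
The perturbation idea behind the MRIII update can be sketched as follows. This is a minimal illustration restricted to real-valued signals and weights, which is an assumption made to keep the hyperbolic tangent unambiguous; the function name, the perturbation size, and the sample values are hypothetical.

```python
import math


def mriii_step(w, x, r, mu, dy=1e-4):
    """One perturbation-based update for real-valued weights and data.

    Returns the updated weights and the error before the update.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))          # array output
    err = math.tanh(y) - r                            # error of modified output
    err_perturbed = math.tanh(y + dy) - r             # error after perturbing y
    slope = (err_perturbed - err) / dy                # measured change ratio
    w_new = [wi - 2 * mu * err * slope * xi for wi, xi in zip(w, x)]
    return w_new, err


# Hypothetical sample, reference value, and step size.
x = [1.0, 0.5]
r = 0.5
w1, err0 = mriii_step([0.0, 0.0], x, r, mu=0.1)
w2, err1 = mriii_step(w1, x, r, mu=0.1)
```

The slope is obtained by measurement rather than by differentiating the nonlinearity analytically, which is what makes the rule attractive for analog implementation.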

The MRIII algorithm described here is suitable when the reference signal is available. 
A scheme to solve constrained beamforming problems using neural networks is analyzed 
in [Cha92], and its implementation using switched capacitor circuits is described in 
[Yan96]. Computer simulations and experimental results indicate the suitability of the 
scheme. 


3.13 Adaptive Beam Space Processing 

In this section, an adaptive algorithm to estimate the weight of the two-beam processor, 
referred to as the postbeamformer interference canceler (PIC) and discussed in Section 2.6.3, 
is presented and its performance is analyzed [God89a]. The analyses include the transient 
and steady-state behavior of the weights. The structure of the processor is shown in Figure 
2.11. These results can be generalized to a multibeam processor. 

Rewrite (2.6.43) to (2.6.46) in discrete notation: 

$$\psi(n) = V^Hx(n) \tag{3.13.1}$$

$$q(n) = U^Hx(n) \tag{3.13.2}$$

$$y(n) = \psi(n) - wq(n) \tag{3.13.3}$$

and 

$$P(w) = V^HRV + w^*wU^HRU - w^*V^HRU - wU^HRV \tag{3.13.4}$$

where $\psi(n)$ denotes the output of the signal beam; $q(n)$ denotes the output of the interference 
beam; $y(n)$ denotes the output of the processor; $P(w)$ denotes the mean output power of the 
processor for a given $w$; $V$ and $U$, respectively, denote the fixed weights of the signal 
beam and the interference beam; and $w$ is the weight applied to the interference beam 
output. 

Let $\hat{w}$ denote the optimal weight that minimizes $P(w)$. From (2.6.48) it is 
given by 

$$\hat{w} = \frac{V^HRU}{U^HRU} \tag{3.13.5}$$

Define a real-time algorithm for determining the optimal weight $\hat{w}$ as 
$$w(n+1) = w(n) - \mu g(w(n)) \tag{3.13.6}$$

where $w(n+1)$ denotes the new weight computed at the $(n+1)$th iteration, $\mu$ is a positive 
scalar defining the step size, and $g(w(n))$ is an unbiased estimate of the gradient of $P(w(n))$ 
with respect to $w$. 

It follows from (3.13.4) that the gradient of $P(w(n))$ with respect to $w$ is given by 

$$\nabla_wP(w)\big|_{w=w(n)} = 2w(n)U^HRU - 2V^HRU \tag{3.13.7}$$


3.13.1 Gradient Estimate 

A suitable estimate of the gradient of $P(w(n))$ with respect to $w$ is given by 

$$g(w(n)) = -2y(n)q^*(n) \tag{3.13.8}$$

In proposing (3.13.8), it is assumed that the gradient algorithm defined by (3.13.6) iterates 
at successive time instants. Thus, at time instant $n + 1$, the processor is actually operating 
with the weight $w(n)$ computed at the previous iteration, and the 
array signal vector is $x(n + 1)$. Note that for a given $w(n)$, the estimate defined by (3.13.8) 
is unbiased, that is, 

$$\begin{aligned} E[g(w(n))\,|\,w(n)] &= -2E[y(n)q^*(n)\,|\,w(n)] \\ &= -2E\left[\{V^Hx(n) - w(n)U^Hx(n)\}x^H(n)U\,|\,w(n)\right] \\ &= -2V^HRU + 2w(n)U^HRU \end{aligned} \tag{3.13.9}$$

A particular characteristic of the gradient used in (3.13.6) that is important in determining 
the performance of the algorithm is the covariance. For the gradient estimate defined 
by (3.13.8), the following result on the covariance is established in Appendix 3.7. 

Let $V_g(w(n))$ denote the covariance of the gradient estimate defined by (3.13.8) for a 
given $w(n)$. If $\{x(n)\}$ is an i.i.d. complex Gaussian sequence, then 

$$V_g(w(n)) = 4U^HRU\left[V^HRV + w^*(n)w(n)U^HRU - w(n)U^HRV - w^*(n)V^HRU\right] \tag{3.13.10}$$

Note that the quantity in the square brackets is the mean output power of the PIC for a 
given $w(n)$. Thus, at each iteration the covariance of the gradient estimate is proportional 
to the mean output power of the PIC that the adaptive algorithm defined by (3.13.6) is 
trying to minimize. 

The convergence analysis of the algorithm defined by (3.13.6) is presented next for the case in which the 
gradient estimate is defined by (3.13.8). In the event that $\{x(n)\}$ is a sequence of i.i.d. 
random complex vectors, a detailed analysis of the algorithm is possible. The analysis is 
carried out using the approach described in Section 3.2. 
3.13.2 Convergence of Weights 

The following result on the convergence of weights is established in Appendix 3.7. For 
the algorithm defined by (3.13.6) and (3.13.8), if $\{x(n)\}$ is an i.i.d. random vector sequence, 

$$|w(0)| < \infty \tag{3.13.11}$$

and 

$$0 < \mu < \frac{1}{U^HRU} \tag{3.13.12}$$

then 

$$\lim_{n\to\infty}E[w(n)] = \hat{w} \tag{3.13.13}$$

and the convergence of $E[w(n)]$ to $\hat{w}$ has the time constant given by 

$$\tau = -\frac{1}{\ln(1 - 2\mu U^HRU)} \tag{3.13.14}$$

where $\ln(\cdot)$ denotes the natural logarithm of $(\cdot)$. 

Note that the step size $\mu$ and the convergence time constant $\tau$ are dependent on $U^HRU$, 
the average power at the output of the interference beamformer, and are independent of 
the output power of the signal beamformer. 
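
The mean-weight behavior can be illustrated numerically. Taking the expectation of (3.13.6) with the unbiased gradient of (3.13.9) gives the first-order recursion used in the sketch below; the numerical values of $U^HRU$ and $V^HRU$, the step size, and the function names are hypothetical and serve only as an example.

```python
import math


def mean_weight_path(w0, mu, p_uu, p_vu, n_iter):
    """Iterate E[w(n+1)] = (1 - 2 mu U^H R U) E[w(n)] + 2 mu V^H R U."""
    path = [w0]
    for _ in range(n_iter):
        path.append((1 - 2 * mu * p_uu) * path[-1] + 2 * mu * p_vu)
    return path


def time_constant(mu, p_uu):
    """tau = -1 / ln(1 - 2 mu U^H R U), as in eq. (3.13.14)."""
    return -1.0 / math.log(1 - 2 * mu * p_uu)


p_uu = 100.0            # hypothetical U^H R U (interference beam power)
p_vu = 20.0 + 10.0j     # hypothetical V^H R U
mu = 0.001              # satisfies 0 < mu < 1 / p_uu
w_opt = p_vu / p_uu     # optimal weight, eq. (3.13.5)

path = mean_weight_path(0j, mu, p_uu, p_vu, 200)
tau = time_constant(mu, p_uu)
# Fraction of the initial weight error remaining after 10 iterations.
ratio_10 = abs(path[10] - w_opt) / abs(path[0] - w_opt)
```

The remaining weight error after n iterations decays as $\exp(-n/\tau)$, and neither $\tau$ nor the step-size bound involves $V^HRV$, the signal beam output power.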


3.13.3 Covariance of Weights 

Let $K_{ww}(n)$ denote the covariance of the weight $w(n)$ at the nth iteration, that is, 

$$K_{ww}(n) = E\left[(w(n) - \bar{w}(n))(w(n) - \bar{w}(n))^*\right] \tag{3.13.15}$$

where 

$$\bar{w}(n) = E[w(n)] \tag{3.13.16}$$

The covariance of the weight $K_{ww}(n)$ satisfies the following recursive relation: 

$$K_{ww}(n+1) = K_{ww}(n)\left[1 - 4\mu U^HRU + 4\mu^2(U^HRU)^2\right] + \mu^2E[V_g(w(n))] \tag{3.13.17}$$

where the expectation is taken over $w$. A derivation of this recursive equation is provided 
in Appendix 3.7. 

Since the covariance of the weight at the $(n+1)$th iteration depends on the covariance 
of the gradient estimated for a given weight at the nth iteration, it is possible to further 
simplify the above recursive relation for a particular method of gradient estimate. When 
the gradient estimate used in (3.13.6) is defined by (3.13.8), an expression for $V_g(w(n))$ is 
given by (3.13.10). The expression is derived with the assumption that $\{x(k)\}$ is an i.i.d. 
complex Gaussian sequence. This assumption is necessary for the results presented 
throughout the remainder of this section. Taking the unconditional expectation on both 
sides of (3.13.10) and substituting in (3.13.17), the following difference equation for the 
covariance of the weight results: 

$$\begin{aligned} K_{ww}(n+1) = {} & K_{ww}(n)\left[1 - 4\mu U^HRU + 8\mu^2(U^HRU)^2\right] \\ & + 4\mu^2U^HRU\left[V^HRV + \bar{w}^*(n)\bar{w}(n)U^HRU - \bar{w}^*(n)V^HRU - \bar{w}(n)U^HRV\right] \end{aligned} \tag{3.13.18}$$


3.13.4 Transient Behavior of Weight Covariance 

Let 

$$H = 1 - 4\mu U^HRU + 8\mu^2(U^HRU)^2 \tag{3.13.19}$$

and 

$$D(n) = U^HRU\left[V^HRV + \bar{w}^*(n)\bar{w}(n)U^HRU - \bar{w}^*(n)V^HRU - \bar{w}(n)U^HRV\right] \tag{3.13.20}$$

Since 

$$\lim_{n\to\infty}\bar{w}(n) = \hat{w} \tag{3.13.21}$$

it follows from (3.13.5) and (3.13.20) that 

$$\lim_{n\to\infty}D(n) = V^HRV\,U^HRU - V^HRU\,U^HRV \tag{3.13.22}$$

From (3.13.18) to (3.13.20), it follows that 

$$K_{ww}(n+1) = K_{ww}(n)H + 4\mu^2D(n) \tag{3.13.23}$$

which has the solution 

$$K_{ww}(n) = H^nK_{ww}(0) + 4\mu^2\sum_{i=1}^{n}D(n-i)H^{i-1} \tag{3.13.24}$$

where $K_{ww}(0)$ is the covariance of $w(0)$. 

Since $w(0)$ is a deterministic scalar, it follows that 

$$K_{ww}(0) = 0 \tag{3.13.25}$$

and thus (3.13.24) reduces to 

$$K_{ww}(n) = 4\mu^2\sum_{i=1}^{n}D(n-i)H^{i-1} \tag{3.13.26}$$

which completely describes the transient behavior of the covariance of the weights when 
the gradient estimate used in (3.13.6) is defined by (3.13.8). 


3.13.5 Steady State Behavior of Weight Covariance 

Take the z transform on both sides of (3.13.23): 

$$zK_{ww}(z) = K_{ww}(z)H + 4\mu^2D(z) \tag{3.13.27}$$

where $K_{ww}(z)$ and $D(z)$ are the z transforms of $K_{ww}(n)$ and $D(n)$, respectively. 

Solving for $K_{ww}(z)$ from (3.13.27), 

$$K_{ww}(z) = 4\mu^2\frac{D(z)}{z - H} \tag{3.13.28}$$

Since $D(z)$ is stable, it follows from (3.13.28) that the stability of $K_{ww}(z)$ is guaranteed if 

$$|H| < 1 \tag{3.13.29}$$

which, along with (3.13.19), implies that if 

$$0 < \mu < \frac{1}{2U^HRU} \tag{3.13.30}$$

then $K_{ww}(z)$ is stable. Thus, $\lim_{n\to\infty}K_{ww}(n)$ exists. 

To obtain the steady-state expression for the weight covariance, let 

$$K_{ww} = \lim_{n\to\infty}K_{ww}(n) \tag{3.13.31}$$

Since 

$$\lim_{n\to\infty}K_{ww}(n) = \lim_{n\to\infty}K_{ww}(n+1) \tag{3.13.32}$$

it follows from (3.13.19), (3.13.22), and (3.13.23) that 

$$K_{ww} = \mu\frac{V^HRV\,U^HRU - V^HRU\,U^HRV}{U^HRU\left[1 - 2\mu U^HRU\right]} \tag{3.13.33}$$

which, along with (3.13.4) and (3.13.5), leads to 

$$K_{ww} = \mu\frac{P(\hat{w})}{1 - 2\mu U^HRU} \tag{3.13.34}$$

where $P(\hat{w})$ is the mean output power of the optimal PIC. 

Thus, the steady-state weight covariance is proportional to the mean output power of 
the optimal PIC. 
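
The steady-state value can be checked against the covariance recursion numerically. In the sketch below the driving term is held at its limiting value, which equals $U^HRU$ times the optimal mean output power; this amounts to assuming that the mean weight has already reached the optimum. All numerical values and function names are hypothetical.

```python
def steady_state_covariance(mu, p_uu, p_opt):
    """Closed form: K_ww = mu P(w_hat) / (1 - 2 mu U^H R U)."""
    return mu * p_opt / (1 - 2 * mu * p_uu)


def iterate_covariance(mu, p_uu, p_opt, n_iter):
    """Run K(n+1) = H K(n) + 4 mu^2 D with D fixed at its limiting value."""
    H = 1 - 4 * mu * p_uu + 8 * mu ** 2 * p_uu ** 2
    D = p_uu * p_opt          # limit of D(n): U^H R U times P(w_hat)
    K = 0.0                   # deterministic initial weight, so K(0) = 0
    for _ in range(n_iter):
        K = H * K + 4 * mu ** 2 * D
    return K


mu, p_uu, p_opt = 0.001, 100.0, 1.0     # hypothetical values
K_closed = steady_state_covariance(mu, p_uu, p_opt)
K_iterated = iterate_covariance(mu, p_uu, p_opt, 200)
```

With these values the recursion settles to the closed-form value after a few tens of iterations, since the homogeneous part decays as $H^n$.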


3.13.6 Misadjustment 

In the absence of noise in the weight, the adaptive algorithm defined by (3.13.6) and 
(3.13.8) would converge to a steady state or optimal point on the mean output power 
surface. The minimum mean output power of the PIC therefore would be $P(\hat{w})$. However, 
the noise in the weight tends to cause the steady-state solution to vary randomly about 
the minimum or optimal point. This results in excess power in the output power of the 
PIC; the amount of excess power depends on the weight covariance. 

As discussed previously, misadjustment is a dimensionless measure of the difference 
between the adaptive and optimal performance of a processor. It is defined as the ratio 
of the excess mean output power to the mean output power of the optimal PIC, that is, 

$$M = \lim_{n\to\infty}\frac{E[P(w(n))] - P(\hat{w})}{P(\hat{w})} \tag{3.13.35}$$

In this section, an analysis of the misadjustment is presented and an exact expression for it 
is derived when the gradient algorithm defined by (3.13.6) and (3.13.8) is used to estimate 
the weight given by (3.13.5). 

Taking the expected value on both sides of (3.13.4) and using (3.13.15) and (3.13.16), 

$$E[P(w(n))] = V^HRV + K_{ww}(n)U^HRU + \bar{w}^*(n)\bar{w}(n)U^HRU - \bar{w}^*(n)V^HRU - \bar{w}(n)U^HRV \tag{3.13.36}$$

Taking the limit as $n \to \infty$ and subtracting $P(\hat{w})$ on both sides of (3.13.36), an expression 
for the steady-state excess mean output power follows: 

$$\lim_{n\to\infty}E[P(w(n))] - P(\hat{w}) = K_{ww}U^HRU \tag{3.13.37}$$

Let $M_P$ denote the misadjustment in the PIC. Equation (3.13.37), along with (3.13.35), implies 
that 

$$M_P = \frac{K_{ww}U^HRU}{P(\hat{w})} \tag{3.13.38}$$

A substitution for $K_{ww}$ from (3.13.34) in (3.13.38) leads to the following expression for the 
misadjustment: 

$$M_P = \frac{\mu U^HRU}{1 - 2\mu U^HRU} \tag{3.13.39}$$

It follows from (3.13.39) that the misadjustment in the adaptive PIC is independent of the 
signal in the signal channel and depends only on the mean power at the output of the 
interference beamformer. Furthermore, for a very small step size $\mu$, it is proportional to 
this power. Thus, to keep the misadjustment small, it is desirable that the interference beamformer 
weight $U$ be chosen such that $U^HRU$ is small. 

However, it follows from (3.13.14) that if 

$$2\mu U^HRU \ll 1 \tag{3.13.40}$$

then 

$$\tau \approx \frac{1}{2\mu U^HRU} \tag{3.13.41}$$

Thus, a smaller power in the interference channel results in a longer convergence time 
constant, which may not be desirable. 

For the range of $\mu$ that satisfies (3.13.40), the misadjustment given by (3.13.39) can be 
approximated as 

$$M_P \approx \mu U^HRU \tag{3.13.42}$$

which, along with (3.13.41), implies that the product of the misadjustment and the convergence 
time constant is given by 

$$M_P\,\tau = 0.5 \tag{3.13.43}$$

and is independent of array geometry and noise parameters. 
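
The trade-off between misadjustment and convergence time can be checked numerically with the exact expressions (3.13.39) and (3.13.14). The sketch below uses a hypothetical interference beam power and a few step sizes; the function names are illustrative.

```python
import math


def misadjustment(mu, p_uu):
    """Exact misadjustment, eq. (3.13.39)."""
    return mu * p_uu / (1 - 2 * mu * p_uu)


def time_constant(mu, p_uu):
    """Exact convergence time constant, eq. (3.13.14)."""
    return -1.0 / math.log(1 - 2 * mu * p_uu)


p_uu = 100.0                                  # hypothetical U^H R U
products = {}
for mu in (1e-3, 1e-4, 1e-5):
    products[mu] = misadjustment(mu, p_uu) * time_constant(mu, p_uu)
    print(mu, products[mu])
```

As the step size shrinks so that the condition in (3.13.40) holds more strongly, the product of the exact quantities approaches 0.5; halving the misadjustment therefore roughly doubles the convergence time.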


3.13.7 Examples and Discussion 

The example presented here is for a planar array of ten elements as shown in Figure 2.7.
The array consists of two rings of five elements each, with half-wavelength inter-ring
spacing $\mu_0$. The radius of the inner ring is $4\mu_0$.

A unity power signal source is assumed in the direction of the positive x-axis and an 
interference source is assumed in the direction of the negative x-axis. The interference 
power is taken to be 20 dB more than the signal power, and the uncorrelated noise power 
is taken to be 20 dB less than the signal power. The interference beam of the PIC is formed 
using 


$$U = P S_I \tag{3.13.44}$$

and

$$P = I - \frac{S_0 S_0^H}{L} \tag{3.13.45}$$

where $S_0$ and $S_I$ are the steering vectors in the directions of the signal and interference
sources, respectively, and I is the identity matrix.

FIGURE 3.14

The output power averaged over 50 runs vs. the iteration number. (From Godara, L.C., J. Acoust. Soc. Am., 85,
194-201, 1989 [God89a]. With permission.)

The interference beam formed using (3.13.44) and (3.13.45) ensures that the interference 
beam has a unity response in the interference direction and a null response in the signal 
direction. The signal beam is formed using the conventional weight, that is, 

$$V = \frac{S_0}{L} \tag{3.13.46}$$

The algorithm is initialized with

$$w(0) = 0 \tag{3.13.47}$$

A gradient step size of $1 \times 10^{-5}$ is used, which is about one-eighth of the inverse of
the estimated power of the interference beam. The power estimate at the output of the
interference beam is made by averaging 100 samples. Figure 3.14 shows the PIC output
power averaged over 50 runs as a function of the number of iterations. The figure shows
that the output of the processor converges to the signal power in about 15 iterations. Figure
3.15 shows the norm of the weight error, that is,

$$\left[(w(n) - \hat{w})^*(w(n) - \hat{w})\right]^{1/2} \tag{3.13.48}$$

averaged over 50 runs as a function of the number of iterations. Convergence of the norm 
of the weight error is evident in the figure. 


FIGURE 3.15

The norm of weight error averaged over 50 runs vs. the iteration number. (From Godara, L.C., J. Acoust. Soc.
Am., 85, 194-201, 1989 [God89a]. With permission.)


3.14 Signal Sensitivity of Constrained Least Mean Squares Algorithm

The convergence of the mean weights estimated by the constrained LMS algorithm to the optimal
weights is a function of the eigenvalues of $P R_N P$, and thus is independent of the look
direction signal. However, this is not the case for the weight covariance matrix, which
depends on the projected covariance of the gradient used for the weight update algorithm,
that is, $P V_g(w(n)) P$. For the standard algorithm, this variance is a product of the array
correlation matrix R and the mean output power $w^H(n) R w(n)$ at the nth instant of time.
Thus, $P V_g(w(n)) P$, which is proportional to $w^H(n) R w(n)\,P R P$, contains a signal from the
look direction, indicating that the performance of the standard LMS algorithm is not
independent of the signal and that the transient behavior of weight covariance depends
on it.

Results presented in Section 3.6 show that the weights estimated by the standard algorithm
are sensitive to signal power in the look direction. As signal power increases, the
noise in these weights tends to increase. The following rather heuristic argument
explains this phenomenon [God97].

Rewrite the constrained LMS algorithm as follows: 


$$w(n+1) = P w(n) + \frac{S_0}{L} - \mu P g(w(n)) \tag{3.14.1}$$

and examine the term $P g(w(n))$ for various estimates of the gradient. First, consider the
true gradient, that is,

$$g(w(n)) = 2 R w(n) \tag{3.14.2}$$

Expressing R in the form

$$R = p_S S_0 S_0^H + R_N \tag{3.14.3}$$

and noting that $P S_0 = 0$, it follows that

$$P g(w(n)) = 2 P R_N w(n) \tag{3.14.4}$$

Thus, the estimate of w(n + 1) for a given w(n) does not depend on the signal power in
the look direction when the true array correlation matrix is used in estimating the gradient.

Now consider the gradient estimate given by (3.4.4) and rewrite it in the following form:

$$g(w(n)) = x(n+1) x^H(n+1) w(n) \tag{3.14.5}$$

where the factor 2 has been omitted for ease of analysis.

The array signal vector x(n) can be expressed as

$$x(n) = m_s(n) S_0 + x_N(n) \tag{3.14.6}$$

where $x_N(n)$ is the array signal vector due to interference and uncorrelated noise only, and
$m_s(n)$ is the sample of the complex modulating function of the signal.

From (3.14.5) and (3.14.6), it follows that

$$\begin{aligned} g(w(n)) = {} & m_s(n+1) m_s^*(n+1) S_0 S_0^H w(n) + x_N(n+1) x_N^H(n+1) w(n) \\ & + m_s(n+1) S_0 x_N^H(n+1) w(n) + m_s^*(n+1) x_N(n+1) S_0^H w(n) \end{aligned} \tag{3.14.7}$$

Since

$$P S_0 = 0 \tag{3.14.8}$$

it follows from (3.14.7) that

$$P g(w(n)) = P x_N(n+1) x_N^H(n+1) w(n) + m_s^*(n+1) P x_N(n+1) S_0^H w(n) \tag{3.14.9}$$


The second term on the RHS of (3.14.9) contains $m_s^*(n+1)$, which is a random quantity
with variance equal to the look direction signal power. This makes $P g(w(n))$ a noisy
quantity that fluctuates with the signal power and causes w(n + 1) to fluctuate. The
fluctuations in w(n + 1) increase as the signal power increases. Thus, the weights estimated
by the standard algorithm are sensitive to the signal power. In the presence of a strong
signal, the algorithm requires a smaller step size to converge, which in turn reduces its
convergence speed.
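
The argument above can be illustrated with a small numerical experiment. The sketch below rests on assumptions that are not in the text: a four-element array with a broadside look direction, $R_N = I$, circular Gaussian signal and noise samples, and hypothetical helper names. It compares the projected true gradient, which does not change with the signal power, with the spread of the projected single-sample gradient of (3.14.5), which grows with it.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 4
s0 = np.ones(L, dtype=complex)                     # look-direction steering vector (assumed)
P = np.eye(L) - np.outer(s0, s0.conj()) / L        # projection operator, P s0 = 0
w = s0 / L + P @ np.array([0.3, -0.1j, 0.2, 0.0])  # any weight with w^H s0 = 1


def crandn(*shape):
    """Zero-mean, unit-variance circular complex Gaussian samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


def projected_sample_gradients(p_s, n=20000):
    """Return n samples of P g(w) with g = x x^H w, as in (3.14.5)."""
    m = np.sqrt(p_s) * crandn(n)          # signal samples with power p_s
    x_n = crandn(n, L)                    # noise-only array vectors, R_N = I
    x = m[:, None] * s0 + x_n             # (3.14.6), one row per sample
    y_conj = x.conj() @ w                 # x^H w for every sample
    return (x * y_conj[:, None]) @ P.T    # each row is (P x x^H w)^T


def summary(p_s):
    """Projected true gradient and total variance of the projected sample gradient."""
    R = p_s * np.outer(s0, s0.conj()) + np.eye(L)
    true_pg = P @ (2.0 * R @ w)           # equals 2 P R_N w, see (3.14.4)
    spread = projected_sample_gradients(p_s).var(axis=0).sum()
    return true_pg, float(spread)


for p_s in (1.0, 100.0):
    true_pg, spread = summary(p_s)
    print(f"p_s = {p_s:6.1f}  P g_true = {np.round(true_pg, 3)}  "
          f"total variance of P g_sample = {spread:.2f}")
```

The projected true gradient printed for the two signal powers is the same, whereas the variance of the projected sample gradient increases roughly in proportion to the signal power.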

This fact has been demonstrated in [Ohg93a] for a high-speed GMSK mobile communications
system. The system has been implemented by mounting an array on a vehicle
to measure its BER performance.

The signal sensitivity of the standard LMS algorithm is caused by the use of a sample
correlation matrix in estimating the gradient, and could be reduced by using an estimate
of the correlation matrix from all available samples as is done with the recursive LMS
algorithm. In this case, the variance of the estimated gradient is given as

$$V_{g_R}(w(n)) = \frac{4}{(n+1)^2}\,w^H(n) R w(n)\,R \tag{3.14.10}$$

Comparing this with the variance of the standard LMS algorithm, note that the variance
of the gradient estimated by the recursive method is less than that estimated by the
standard method by a factor of $(n+1)^2$. Thus, the recursive algorithm is less sensitive
to signal power. In the limit as n increases, the signal sensitivity of the recursive LMS
algorithm approaches zero.

The signal sensitivity of the LMS also can be reduced by spatial averaging instead of
sample averaging, as is the case in the structured gradient algorithm. Because of spatial
averaging and the fact that $m_s(n+1)$ and $x_N(n+1)$ are not correlated, the dependence of
$P g_{St}(w(n))$ on the signal level is substantially reduced. Thus, the weights estimated by the
structured gradient algorithm are not very sensitive to the signal level in the look direction.


3.15 Implementation Issues 

In this section, some implementation issues relating to finite precision arithmetic and real 
vs. complex implementation are discussed [God97]. 

3.15.1 Finite Precision Arithmetic 

The convergence speed, fluctuations in array weights during adaption, and misadjustment 
noise are the measures of the transient and the steady-state behavior of the LMS algorithm. 
Theoretical performance of the algorithm and the effect of the look direction signal and 
gradient step size discussed in previous sections assume the existence of infinite precision, 
that is, the variables are allowed to take any value. 

In real life, when the algorithm is implemented using digital hardware where variables 
can only take discrete values, other parameters affect its performance, and issues that 
must be considered include quantization noise as well as roundoff and truncation noise 
caused by finite precision arithmetic [Eva93, Ale87, Cha91, Leu91, Won91, Car84, Cio85]. 

First, when a b-bit quantizer is used to convert an analog signal of range $-r_{\max}$ to $r_{\max}$
into a digital signal, it adds quantization noise of zero mean and variance [Opp75],



to the system. Second, the effect of finite word length of the devices where the numbers are 
stored causes the roundoff or truncation noise to be added to the system. This arises from 
the fact that when arithmetic operations are performed using these numbers, the answers 
are normally longer than the available word length and need to be rounded off or truncated 
to fit into finite word memory. Finally, all variables such as the estimated gradient, the 
gradient step size, and the estimated weights are only allowed to take finite values, and can 
be increased or decreased by a factor of 2. The combined effect of all these factors on the 
algorithm is a larger fluctuation in weights and a larger misadjustment than otherwise. 
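
As a rough illustration of the first effect, the sketch below measures the noise added by a b-bit quantizer. It assumes the common uniform rounding quantizer model, with step $2r_{\max}/2^b$ and noise variance equal to the squared step divided by 12; this model, the input distribution and the function name are assumptions made for the illustration, not expressions taken from the text.

```python
import numpy as np


def quantize(x, b, r_max):
    """Uniform b-bit rounding quantizer over [-r_max, r_max] (assumed model)."""
    step = 2.0 * r_max / 2**b              # width of one quantization level
    return step * np.round(x / step), step


rng = np.random.default_rng(1)
r_max, b = 1.0, 8
x = rng.uniform(-r_max, r_max, 200_000)    # test input spanning the full range
xq, step = quantize(x, b, r_max)
err = xq - x                               # quantization noise

print("mean of quantization noise :", err.mean())
print("measured noise variance    :", err.var())
print("uniform-model step**2 / 12 :", step**2 / 12)
```

Each additional bit halves the step and therefore reduces the noise variance by a factor of four under this model.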

The misadjustment appears to be most sensitive to the finite word length effect on the
weights, suggesting that the weights should be implemented using a longer word length
[Ale87]. A reduction in the step size below a certain level may even cause the
misadjustment to increase [Car84], which is contrary to the infinite precision case, where a
decrease in the step size causes the misadjustment to decrease. It appears [Cio85] that the finite
word-length effects are amplified in environments that yield smaller eigenvalues
for the correlation matrix.

An important effect of the finite word length on the weight update is that when a small
input does not cause the weights to move by more than the least significant bit (the smallest
possible increment, which depends on the number of bits used to store the weights), the
algorithm stalls and the weights no longer change [Car84]. Avoiding this requires a bigger step size,
which in turn increases weight fluctuations.
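
The stalling effect can be reproduced with a toy example. The sketch below is a deliberately simplified scalar LMS loop with a constant input and desired response, and with the weight rounded to a grid of spacing $2^{-\text{bits}}$ after every update; the setup and the function names are hypothetical and serve only to illustrate the mechanism.

```python
import numpy as np


def quantized_lms(mu, bits, n_iter=2000):
    """Scalar LMS with the weight stored on a grid of spacing 2**-bits."""
    lsb = 2.0 ** -bits
    w = 0.0
    for _ in range(n_iter):
        x, d = 1.0, 1.0                        # constant input and desired response
        update = mu * (d - w * x) * x          # LMS correction term
        w = float(lsb * np.round((w + update) / lsb))  # store with finite word length
    return w


def ideal_lms(mu, n_iter=2000):
    """Same recursion with unlimited precision."""
    w = 0.0
    for _ in range(n_iter):
        w += mu * (1.0 - w)
    return w


print("ideal, mu = 0.01        :", ideal_lms(0.01))
print("8-bit weight, mu = 0.01 :", quantized_lms(0.01, 8))
print("8-bit weight, mu = 0.1  :", quantized_lms(0.1, 8))
```

With the small step size the quantized weight stops well short of the optimum of 1, because the correction falls below half a least significant bit; the larger step size moves the stall point closer to the optimum.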

A post-algorithm smoothing scheme suggested in [Cha91] appears to reduce weight 
fluctuations leading to better convergence performance. It suggests a running average of 
past weights. Thus, the weights are recursively updated using past weights with or 
without finite memory. A discussion of system design applicable to mobile satellite
communications, which takes into account quantization noise and the other issues discussed above,
may be found in [Geb95].


3.15.2 Real vs. Complex Implementation 

In some situations, the input data to the weight adaption scheme are real, and in others,
the data are complex (with real and imaginary parts denoting in-phase and quadrature
components). In both of these cases, the weights could be updated using the real LMS
algorithm or the complex LMS algorithm. The former uses real arithmetic and real variables,
and updates real weights (the in-phase and quadrature components are updated separately
when complex data are available), whereas the complex algorithm [Wid75] uses
complex arithmetic and variables, and the weights are updated as well as implemented as
complex variables, similar to the treatment presented in this book. To use the complex
algorithm with real data, the quadrature component must be generated using a Hilbert
transformer or quadrature filter [Pap65], which has the following transfer function:

$$H(f) = \begin{cases} -j, & f > 0 \\ \phantom{-}j, & f < 0 \end{cases} \tag{3.15.2}$$


For a similar misadjustment, the complex algorithm converges faster than the real algorithm.
More details on this topic are available in [Hor81, God86]. Some of these issues are
discussed below.


3.15.2.1 Quadrature Filter 

The output of the quadrature filter is related to its input by the Hilbert transform. Before
deriving an expression for the quadrature filter transfer function given by (3.15.2), the
Hilbert transform is defined and some useful properties are stated.

Let $\hat{x}(t)$ denote the Hilbert transform of a real signal x(t), defined as

$$\hat{x}(t) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{x(\tau)}{t-\tau}\,d\tau \tag{3.15.3}$$

The Hilbert transform has the following properties. First, the Hilbert transform of the Hilbert
transform is the negative of the original signal, that is,

$$\hat{\hat{x}}(t) = -x(t) \tag{3.15.4}$$

A signal and its Hilbert transform form an orthogonal pair, that is,

$$\int_{-\infty}^{\infty} x(t)\hat{x}(t)\,dt = 0 \tag{3.15.5}$$

The Hilbert transform of a constant C is zero, that is,

$$\hat{C} = 0 \tag{3.15.6}$$

The Hilbert transform of $\cos(\omega t)$ is $\sin(\omega t)$, $\omega > 0$. If

$$x(t) = a(t)\cos(2\pi f_0 t + \theta), \quad f_0 > 0 \tag{3.15.7}$$

such that the highest frequency of a(t) is less than $f_0$, then

$$\hat{x}(t) = a(t)\sin(2\pi f_0 t + \theta), \quad f_0 > 0 \tag{3.15.8}$$

Now a derivation of (3.15.2) is presented. It follows from (3.15.3) that $\hat{x}(t)$ is a convolution
of x(t) and $1/\pi t$, that is,

$$\hat{x}(t) = x(t) * \frac{1}{\pi t} \tag{3.15.9}$$

Thus, the Hilbert transform can be thought of as the output of a system (Hilbert transformer)
with an impulse response h(t) given by

$$h(t) = \frac{1}{\pi t} \tag{3.15.10}$$

Let sgn(t) denote the sign function, that is,

$$\mathrm{sgn}(t) = \begin{cases} +1, & t > 0 \\ -1, & t < 0 \end{cases} \tag{3.15.11}$$

and $F\{\cdot\}$ denote the Fourier transform of $\{\cdot\}$. Noting that

$$F\{\mathrm{sgn}(t)\} = \frac{1}{j\pi f} \tag{3.15.12}$$

it follows from the duality theorem of the Fourier transform that

$$F\left\{\frac{1}{\pi t}\right\} = -j\,\mathrm{sgn}(f) \tag{3.15.13}$$

Taking the Fourier transform on both sides of (3.15.10) and using (3.15.13), the following
expression is obtained for the transfer function of the Hilbert transformer, also known as
the quadrature filter:

$$H(f) = -j\,\mathrm{sgn}(f) = \begin{cases} -j, & f > 0 \\ \phantom{-}j, & f < 0 \end{cases} \tag{3.15.14}$$

In the remainder of this section, a real valued signal is denoted by $x_I(t)$ and its Hilbert
transform is denoted by $x_Q(t)$.
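
The transfer function (3.15.14) suggests a simple discrete approximation of the Hilbert transformer: multiply the FFT of a sampled signal by $-j\,\mathrm{sgn}(f)$ and transform back. The sketch below is an illustration under that assumption (a periodic, sampled test signal and a hypothetical function name); it checks the properties (3.15.4) to (3.15.6) and the cosine-to-sine property on a test tone.

```python
import numpy as np


def hilbert_transform(x):
    """Discrete approximation of the Hilbert transformer using H(f) = -j sgn(f)."""
    X = np.fft.fft(x)
    f = np.fft.fftfreq(len(x))                     # signed normalized frequencies
    return np.real(np.fft.ifft(-1j * np.sign(f) * X))


n = 1024
t = np.arange(n) / n
x = np.cos(2 * np.pi * 8 * t)                      # 8 whole cycles, so no spectral leakage
x_hat = hilbert_transform(x)

print("cos -> sin          :", np.allclose(x_hat, np.sin(2 * np.pi * 8 * t)))
print("double transform    :", np.allclose(hilbert_transform(x_hat), -x))
print("orthogonality       :", abs(np.dot(x, x_hat)) < 1e-9)
print("constant maps to 0  :", np.allclose(hilbert_transform(np.full(n, 3.0)), 0.0))
```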

3.15.2.2 Analytical Signals 

A complex valued signal x(t) is said to be an analytical signal if its real and imaginary
parts are related via the Hilbert transform. Thus, it can be expressed as

$$x(t) = x_I(t) + j x_Q(t) \tag{3.15.15}$$

where $x_Q(t) = \hat{x}_I(t)$.

Taking the Fourier transform on both sides of (3.15.15) and using the properties of the
Hilbert transform, it can easily be shown that x(t) has a one-sided spectrum.

3.15.2.3 Beamformer Structures 

Consider the structures of two narrowband beamformers shown in Figure 3.16 and Figure 
3.17. Figure 3.16 shows a real beamforming system and Figure 3.17 shows an in-phase 
and quadrature (IQ) or complex beamforming system. The real beamforming system has 
a single real valued output that can be produced by using real multiplication to achieve 
the weighting of the array signals. The other has a complex valued output and can be 
produced by using the complex multiplication to achieve the weighting of the array 
signals. 



FIGURE 3.16

Real beamforming system.


FIGURE 3.17

In-phase and quadrature beamforming system.


Let the L-dimensional complex vector x(t) denote the array signal in complex notation
defined as

$$x(t) = x_I(t) + j x_Q(t) \tag{3.15.16}$$

where the L-dimensional real vectors $x_I(t)$ and $x_Q(t)$ denote the in-phase and quadrature
array signals, respectively.

Define the L-dimensional complex weight vector w as

$$w = w_I + j w_Q \tag{3.15.17}$$

where the L-dimensional real vectors $w_I$ and $w_Q$ denote the weights as shown in Figure
3.16 and Figure 3.17.

Let $y_I(t)$ denote the output of the real beamforming system. It follows from Figure 3.16
that it is given by

$$y_I(t) = w_I^T x_I(t) + w_Q^T x_Q(t) = \mathrm{Re}[w^H x(t)] = \frac{1}{2}\left[w^H x(t) + x^H(t) w\right] \tag{3.15.18}$$

Let y(t) denote the output of the IQ beamforming system. It can easily be shown from
Figure 3.17 that the output of the IQ beamforming system is given by

$$y(t) = w^H x(t) \tag{3.15.19}$$

FIGURE 3.18

Real algorithm for real beamforming system.


FIGURE 3.19

Real algorithm for IQ beamforming system.


Similarly, the beamformer structure may be developed when the reference signal is available.
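
The relationship between the two outputs in (3.15.18) and (3.15.19) is easy to confirm numerically. The sketch below uses randomly drawn in-phase and quadrature vectors purely as an assumed test case.

```python
import numpy as np

rng = np.random.default_rng(2)
L = 5
x_i, x_q = rng.standard_normal(L), rng.standard_normal(L)   # in-phase / quadrature signals
w_i, w_q = rng.standard_normal(L), rng.standard_normal(L)   # in-phase / quadrature weights

x = x_i + 1j * x_q                  # (3.15.16)
w = w_i + 1j * w_q                  # (3.15.17)

y_real = w_i @ x_i + w_q @ x_q      # real beamformer output, real multiplications only
y_iq = np.vdot(w, x)                # IQ beamformer output w^H x (vdot conjugates w)

print("real beamformer output :", y_real)
print("Re[w^H x]              :", y_iq.real)
print("0.5 (w^H x + x^H w)    :", (0.5 * (np.vdot(w, x) + np.vdot(x, w))).real)
```

The real system needs 2L real multiplications per output sample, while the IQ system forms both the real and the imaginary part of $w^H x(t)$.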

Next, an implementation of the two algorithms is discussed with a view to compare the 
difference in convergence speed. The development presented here is for the constrained 
LMS algorithm. It can easily be extended for the unconstrained case. 

3.15.2.4 Real LMS Algorithm 

Implementation of the real LMS algorithm for the real beamforming system is shown in
Figure 3.18, and for the IQ beamforming system in Figure 3.19. When all signals
on the array are accessible, a suitable estimate of the required gradient of $w^H R w$ for w =
w(n) is

$$g_R(w(n)) = 4 x(n+1) y_I(n+1) \tag{3.15.20}$$

FIGURE 3.20

Complex algorithm for IQ beamforming system.


In the real beamforming system, $y_I(n+1)$ represents the only output of the system. In the IQ
beamforming system, it is the real part of the output. In both cases, it is a real valued
quantity given by

$$y_I(n+1) = \frac{1}{2}\left[w^H(n) x(n+1) + x^H(n+1) w(n)\right] \tag{3.15.21}$$

Note from (3.15.20) that real multiplications are used in estimating the real and imaginary
parts of the complex valued quantity $g_R(w(n))$.

Using the result $E[x(t)x^T(t)] = 0$, it can easily be verified that the gradient estimate given by (3.15.20)
and (3.15.21) is unbiased; that is, for a given w(n),

$$E[g_R(w(n))] = 2 R w(n) \tag{3.15.22}$$

Let $V_{g_R}(w(n))$ denote the covariance of the gradient estimate given by (3.15.20) and
(3.15.21). For a zero mean, stationary complex Gaussian vector process {x(k)}, it is given by

$$V_{g_R}(w(n)) = 4 w^H(n) R w(n) R + 8 R w^H(n) w(n) R \tag{3.15.23}$$

The derivation of (3.15.23) can easily be carried out following the procedure used in Section
3.4.2.
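
The unbiasedness stated in (3.15.22) can be checked by simulation. The sketch below is a Monte Carlo illustration under assumed conditions: circular complex Gaussian data generated as $x = Az$ with an arbitrary mixing matrix A (so that $R = AA^H$), an arbitrary fixed weight vector, and a finite number of samples. For comparison it also forms the complex estimate $2x\,y^*$ with $y = w^H x$, as used by the complex LMS algorithm, and prints the total variance of both estimates.

```python
import numpy as np

rng = np.random.default_rng(3)
n, L = 200_000, 3
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [0.2j, -0.3, 0.8]], dtype=complex)    # assumed mixing matrix
R = A @ A.conj().T                                  # array correlation matrix
z = (rng.standard_normal((n, L)) + 1j * rng.standard_normal((n, L))) / np.sqrt(2)
X = z @ A.T                                         # rows are x^T with x = A z
w = np.array([0.5, -0.2 + 0.1j, 0.3j])              # arbitrary fixed weights

y_i = np.real(X @ w.conj())                         # y_I = Re[w^H x], (3.15.21)
g_real = 4 * X * y_i[:, None]                       # real-algorithm estimate, (3.15.20)
g_cplx = 2 * X * (X.conj() @ w)[:, None]            # complex estimate 2 x y^*

print("2 R w              :", np.round(2 * R @ w, 3))
print("mean of g_R        :", np.round(g_real.mean(axis=0), 3))
print("mean of complex g  :", np.round(g_cplx.mean(axis=0), 3))
print("total variance, real vs. complex:",
      round(float(g_real.var(axis=0).sum()), 2),
      round(float(g_cplx.var(axis=0).sum()), 2))
```

Both sample means approach $2Rw$, while the estimate used by the real algorithm shows the larger total variance.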

3.15.2.5 Complex LMS Algorithm 

Implementation of the complex algorithm for the IQ beamforming system is shown in
Figure 3.20, and for the real beamforming system in Figure 3.21. When all signals on the
array are accessible, a suitable estimate of the required gradient of $w^H R w$ for w = w(n) is

$$g(w(n)) = 2 x(n+1) y^*(n+1) \tag{3.15.24}$$

FIGURE 3.21

Complex algorithm for real beamforming system.


where

$$y(n+1) = \begin{cases} \text{output} & \text{when the complex system is used} \\ y_I(n+1) + j\,\hat{y}_I(n+1) & \text{when the real system is used} \end{cases} \tag{3.15.25}$$

As $\hat{y}_I(n+1) = y_Q(n+1)$, it follows from (3.15.25) that

$$y(n+1) = w^H(n) x(n+1) \tag{3.15.26}$$

It follows from (3.15.24) and (3.15.25) that the gradient estimate in this case is identical to
that for the standard LMS algorithm discussed in Section 3.4.1. Thus, the gradient covariance
for this case is given by (3.4.6). Denoting it by $V_{g_c}(w(n))$ and rewriting (3.4.6),

$$V_{g_c}(w(n)) = 4 w^H(n) R w(n) R \tag{3.15.27}$$

3.15.2.6 Discussion 

Comparing (3.15.23) and (3.15.27) one notes that

$$V_{g_R}(w(n)) = V_{g_c}(w(n)) + 8 R w^H(n) w(n) R \tag{3.15.28}$$

Thus, the covariance of the gradient used in the real algorithm is more than that used in
the complex algorithm. The extra term $8 R w^H(n) w(n) R$, present for the case of the real
algorithm, results in more misadjustment for this case. Let $M_R$ denote the misadjustment
when the gradient is given by (3.15.20) and (3.15.21). Following the procedure used in
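Section 3.4 it can be shown that [God86] if

$$0 < \mu < \frac{1}{4\lambda_{\max}} \tag{3.15.29}$$

and

$$2\mu\sum_{i}\frac{\lambda_i}{1 - 2\mu\lambda_i} < 1 \tag{3.15.30}$$

then the misadjustment is given by

$$M_R = \frac{2\mu\sum_{i}\dfrac{\lambda_i}{1 - 2\mu\lambda_i}}{1 - 2\mu\sum_{i}\dfrac{\lambda_i}{1 - 2\mu\lambda_i}} \tag{3.15.31}$$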


The misadjustment for the complex case is given by (3.4.52). Let it be denoted by $M_c$.
Comparing (3.15.30) with (3.4.52), one notes that

$$M_c(2\mu) = M_R(\mu) \tag{3.15.32}$$

Thus, it follows that the misadjustment in both cases would be the same if the gradient step size
used in the complex case is double that used in the real case. Since for a small step size the
convergence time constant is inversely proportional to the step size, it follows that for the
same misadjustment the convergence time constant for the complex LMS algorithm is half
that of the real LMS algorithm. This means that for the same misadjustment, the convergence
speed of the complex LMS algorithm is twice that of the real LMS algorithm.


Notation and Abbreviations

$\|A\|_F$    Frobenius norm of a matrix A
$E[\cdot]$    expectation operator
$E[x|y]$    conditional expectation for given y
ln    natural logarithm
Tr(.)    trace of (.)
$(\cdot)^H$    Hermitian transpose of (.)
$(\cdot)^T$    transpose of (.)
CMA    constant modulus algorithm
IQ    in-phase and quadrature
LMS    least mean squares
MSE    mean square error
RHS    right-hand side
RLS    recursive least squares
SNR    signal-to-noise ratio
SMI    sample matrix inverse
TDMA    time-division multiple access
F    normalized steering vector in look direction
$F\{\cdot\}$    Fourier transform
g(w(n))    gradient estimate for given w(n)
g(w(n))    gradient estimate for given w(n)
$g_1(w(n))$    gradient estimate using single-receiver system for given w(n)
$g_2(w(n))$    gradient estimate using dual perturbation system for given w(n)
$g_3(w(n))$    gradient estimate using reference receiver system for given w(n)
$g_{St}(w(n))$    gradient estimate using structured gradient algorithm for given w(n)
$g_R(w(n))$    gradient estimate using recursive LMS algorithm for given w(n) and gradient estimate using real LMS algorithm for given w(n)
$g_I(w(n))$    gradient estimate using improved LMS algorithm for given w(n)
$\bar{g}(w(n))$    mean of the gradient estimate for given w(n)
H(f)    transfer function of quadrature filter
h(t)    impulse response
I    identity matrix
i.i.d.    independent identically distributed
Im[.]    imaginary part of complex quantity
K    degree of freedom
$k_{ww}(n)$    covariance of w(n)
$k_{ww}$    covariance of w(n) in limit
$k_{ww}(n)$    covariance matrix of w(n)
$k_{ww_2}(n)$    covariance matrix of w(n) in dual perturbation system
$k_{ww_3}(n)$    covariance matrix of w(n) in reference receiver system
$k_0(n)$    constant denoting $w^H(n) R w(n)$
L    number of elements in array
M    misadjustment, length of sequence S
$M_c$    misadjustment in complex LMS algorithm
$M_P$    misadjustment in adaptive PIC
$M_R$    misadjustment in real LMS algorithm
$M_S$    misadjustment in standard LMS algorithm
$M_u$    misadjustment in unconstrained LMS algorithm
$M_1$    misadjustment in single-receiver system
$M_2$    misadjustment in dual perturbation system
$M_3$    misadjustment in reference receiver system
$m_s(n)$    complex modulating function of signal
N    number of samples
P    projection operator
P(n)    output power at nth iteration
P(n)    mean output power at nth iteration
P(w(n))    mean output power for given w(n)
P(w)    mean output power of PIC for given w
$P_N(w(n))$    mean output noise power for given w(n)
P    mean output power of optimal processor
$p_r$    mean power of reference signal
$p_S$    signal power
Q    matrix with columns being eigenvectors of R
Q    matrix with columns being eigenvectors of PRP
Q    eigenvector corresponding to $\lambda_i$
q(n)    output of interference beam at nth instant of time
R    array correlation matrix
R(n)    estimate of R at nth instant of time using only one sample
R(n)    estimate of R at nth instant of time using past samples
R(n)    estimate of R at nth instant of time using spatial averaging
R(n)    estimate of R using past samples and spatial averaging
Re[.]    real part of complex quantity
$R_N$    array correlation matrix with no signal present
$R_{ww}(n)$    correlation matrix of w(n)
R(N)    estimate of R using N samples
$r_i$    correlation between elements with lag i
S    complex vector sequence
$S_0$    steering vector associated with look direction
$S_I$    steering vector associated with interference
sgn(t)    sign function
U    fixed weights of interference beam
$U_i$    eigenvector corresponding to $\lambda_i$ of R
V    fixed weights of signal beam
$V_g(w(n))$    covariance of gradient for given w(n)
$V_g(w(n))$    covariance of gradient for given w(n)
$V_{g_c}(w(n))$    covariance of gradient in complex LMS algorithm for given w(n)
$V_{g_1}(w(n))$    covariance of gradient using single receiver system for given w(n)
$V_{g_2}(w(n))$    covariance of gradient using dual perturbation system for given w(n)
$V_{g_3}(w(n))$    covariance of gradient using reference receiver system for given w(n)
$V_{g_{St}}(w(n))$    covariance of gradient in structured LMS algorithm for given w(n)
$V_{g_R}(w(n))$    covariance of gradient in recursive LMS algorithm for given w(n) and covariance of gradient in real LMS algorithm for given w(n)
$V_{g_S}(w(n))$    covariance of gradient in standard LMS algorithm for given w(n)
v(n)    mean error vector at nth iteration
$\hat{w}$    optimal weights of constrained processor
w(n)    array weights at nth iteration
$\bar{w}(n)$    mean value of w(n)
$\hat{w}_{MSE}$    optimal weights of processor with reference signal
$\hat{w}$    optimal weight of PIC
w(n)    PIC weight at nth iteration
$\bar{w}(n)$    mean value of w(n)
x(n)    array signal vector at nth instant of time
$\hat{x}(t)$    Hilbert transform of x(t)
$x_I(t)$    in-phase signal
$x_Q(t)$    quadrature phase signal
$x_N(n)$    array signal vector due to interference and uncorrelated noise
y(t)    output of IQ beamforming system
y(n)    array output at nth instant of time
y(w(n))    array output for given w(n)
$y_I(t)$    output of real beamforming system
z    correlation between reference signal and array signals
$\Gamma$    diagonal matrix of eigenvalues of P
$\Lambda$    diagonal matrix of eigenvalues of R
$\Lambda$    diagonal matrix of eigenvalues of PRP
$\Lambda$    diagonal matrix of nonzero eigenvalues of PRP
$\Sigma(n)$    diagonal matrix of eigenvalues of $k_{ww}(n)$
$\gamma$    perturbation step size
$\gamma(w(n))$    step size for which $V_{g_1}(w(n))$ is minimum
$\delta(l)$    L-dimensional complex vector
$\varepsilon(w(n))$    error between array output and reference signal for given w(n)
q($\gamma$)    perturbation noise for given $\gamma$
$\xi(w(n))$    MSE for given w(n)
$\hat{\xi}$    minimum MSE
$\bar{\xi}(n)$    average value of MSE at nth iteration
$\mu$    gradient step size
$\mu_0$    inter-ring spacing
$\mu(n)$    gradient step size at nth iteration
$\lambda_i$    ith eigenvalue of R
$\lambda_i$    ith eigenvalue of PRP
$\lambda_{\max}$    maximum eigenvalue of R
$\lambda_{\max}$    maximum eigenvalue of PRP
$\lambda$    vector of eigenvalues of R
$\lambda$    vector of eigenvalues of PRP
$\lambda$    vector of nonzero eigenvalues of PRP
-8    vector of eigenvalues of P
$\eta_i(n)$    ith eigenvalue of $k_{ww}(n)$
$\eta(n)$    vector of eigenvalues of $k_{ww}(n)$
$\eta_2(n)$    vector of eigenvalues of $k_{ww_2}(n)$
$\eta'(n)$    vector of nonzero eigenvalues of $k_{ww}(n)$
$\psi(n)$    output of signal beam at nth instant of time
$\tau$    time constant for adaptive PIC
$\tau_i$    ith time constant

Appendices
Appendix 3.1

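In this appendix a derivation of (3.2.90) is presented. First, the following theorem used in
the derivation is established [Hor81, God86].

Theorem 3.1: Let the set of difference equations

$$D_i(n+1) = \alpha_i D_i(n) + \beta_i\left[C(n) + \sum_{l=1}^{L} D_l(n)\right] + c_i(n), \quad i = 1, \ldots, L \tag{3A.1}$$

be such that

$$\lim_{n\to\infty} C(n) = C_\infty \tag{3A.2}$$

$$\lim_{n\to\infty} c_i(n) = 0 \tag{3A.3}$$

$$D_i(0) < \infty \tag{3A.4}$$

and

all the eigenvalues of the system (3A.1) are real and positive. (3A.5)

If

$$\alpha_{\max} + \beta_{\max} < 1 \tag{3A.6}$$

and

$$\prod_{i=1}^{L}(1 + \delta - \alpha_i) - \sum_{i=1}^{L}\beta_i\prod_{l \ne i}^{L}(1 + \delta - \alpha_l) > 0, \quad \text{for } \delta > 0 \tag{3A.7}$$

then

$$\lim_{n\to\infty}\sum_{i=1}^{L} D_i(n) \tag{3A.8}$$

exists and is given by

$$\lim_{n\to\infty}\sum_{i=1}^{L} D_i(n) = C_\infty\,\frac{\sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i}}{1 - \sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i}} \tag{3A.9}$$

where $\alpha_{\max} + \beta_{\max}$ denotes the maximum value of $\alpha_i + \beta_i$, i = 1, ..., L.

Proof of Theorem 3.1: The proof of the theorem makes use of the z-transform. If F(z)
denotes the z-transform of a sequence f(n), denoted as Z{f(n)}, then the z-transform of f(n + 1)
is given by

$$Z\{f(n+1)\} = zF(z) - zf(0) \tag{3A.10}$$

Taking the z-transform of the ith equation of (3A.1) and using (3A.10),

$$D_i(z) = z^{-1}\alpha_i D_i(z) + z^{-1}\beta_i\left[C(z) + \sum_{l=1}^{L} D_l(z)\right] + z^{-1}c_i(z) + D_i(0) \tag{3A.11}$$

where $D_i(0)$ denotes the value of $D_i(n)$ at n = 0.
It follows from (3A.11) that

$$D_i(z) = \frac{z^{-1}\beta_i\left[C(z) + \sum_{l=1}^{L} D_l(z)\right] + z^{-1}c_i(z) + D_i(0)}{1 - \alpha_i z^{-1}} = \frac{z^{-1}\beta_i\left[C(z) + \sum_{l \ne i}^{L} D_l(z)\right] + z^{-1}c_i(z) + D_i(0)}{1 - (\alpha_i + \beta_i) z^{-1}} \tag{3A.12}$$

Let

$$u(z) = \sum_{i=1}^{L} D_i(z) \tag{3A.13}$$

It follows from the first equation of (3A.12) and (3A.13) that

$$u(z) = z^{-1}\sum_{i=1}^{L}\frac{\beta_i}{1-\alpha_i z^{-1}}\left[C(z) + u(z)\right] + \sum_{i=1}^{L}\frac{z^{-1}c_i(z) + D_i(0)}{1-\alpha_i z^{-1}} = \frac{z^{-1}C(z)\sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i z^{-1}} + \sum_{i=1}^{L}\dfrac{z^{-1}c_i(z) + D_i(0)}{1-\alpha_i z^{-1}}}{1 - z^{-1}\sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i z^{-1}}} \tag{3A.14}$$

It can easily be shown that the characteristic equation of (3A.14) is

$$\prod_{i=1}^{L}(z - \alpha_i) - \sum_{i=1}^{L}\beta_i\prod_{l \ne i}^{L}(z - \alpha_l) = 0 \tag{3A.15}$$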
For the stability of u(z), it is necessary that $D_i(z)$ is stable for all i and that all roots of (3A.15)
lie inside the unit circle.

It follows from the second equation of (3A.12) that the stability of $D_i(z)$ is guaranteed if

$$|\alpha_i + \beta_i| < 1 \quad \forall i \tag{3A.16}$$

Equation (3A.16) follows from (3A.6). Thus, $D_i(z)$ is stable for all i.
By assumption, all eigenvalues of (3A.1) are positive and real. This implies that all the
roots of (3A.15) are positive and real. Since the sign of (3A.15) is positive for large values
of z, it follows that if no root of (3A.15) is to lie between z = 1 and $z = \infty$, then (3A.15)
must be positive for $z = 1 + \delta$ for all $\delta > 0$, that is,

$$\prod_{i=1}^{L}(1 + \delta - \alpha_i) - \sum_{i=1}^{L}\beta_i\prod_{l \ne i}^{L}(1 + \delta - \alpha_l) > 0 \quad \text{for all } \delta > 0 \tag{3A.17}$$

which is true by (3A.7).

Thus, u(z) is stable and $\lim_{n\to\infty} u(n) = \lim_{n\to\infty}\sum_{i=1}^{L} D_i(n)$ exists, which proves the existence of (3A.8).

Equation (3A.9) is now established. From the properties of the z-transform, the final
value of the sequence f(n) is given by

$$\lim_{n\to\infty} f(n) = \lim_{z\to 1}(1 - z^{-1})F(z) \tag{3A.18}$$

It follows from (3A.2), (3A.3), (3A.4), and (3A.18) that

$$\lim_{z\to 1}(1 - z^{-1})C(z) = C_\infty \tag{3A.19}$$

$$\lim_{z\to 1}(1 - z^{-1})c_i(z) = 0 \tag{3A.20}$$

and

$$\lim_{z\to 1}(1 - z^{-1})D_i(0) = 0 \tag{3A.21}$$

Multiplying both sides of (3A.14) by $(1 - z^{-1})$, taking the limit $z \to 1$, and using (3A.13)
and (3A.19) to (3A.21),

$$\lim_{n\to\infty}\sum_{i=1}^{L} D_i(n) = C_\infty\,\frac{\sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i}}{1 - \sum_{i=1}^{L}\dfrac{\beta_i}{1-\alpha_i}} \tag{3A.22}$$

which is (3A.9). Thus, the theorem is proved.

Proof of (3.2.90): Let

$$D(n) = \Lambda\eta(n) \tag{3A.23}$$

Noting that

$$\Lambda 1 = \lambda \tag{3A.24}$$

and premultiplying on both sides of (3.2.73) by $\Lambda$ and using (3A.23),

$$D(n+1) = \{I - 4\mu\Lambda + 4\mu^2\Lambda^2 + 4\mu^2\Lambda^2 1 1^T\}D(n) + 4\mu^2\xi(\bar{w}(n))\Lambda^2 1 \tag{3A.25}$$

The ith component of (3A.25) is given by

$$D_i(n+1) = \{1 - 4\mu\lambda_i + 4\mu^2\lambda_i^2\}D_i(n) + 4\mu^2\lambda_i^2\left[\xi(\bar{w}(n)) + \sum_{l=1}^{L} D_l(n)\right] \tag{3A.26}$$

Now apply Theorem 3.1. Equation (3A.26) satisfies the form given by (3A.1) with

$$\alpha_i = 1 - 4\mu\lambda_i + 4\mu^2\lambda_i^2 \tag{3A.27}$$

$$\beta_i = 4\mu^2\lambda_i^2 \tag{3A.28}$$

$$C(n) = \xi(\bar{w}(n)) \tag{3A.29}$$

and

$$c_i(n) = 0 \tag{3A.30}$$

It follows from (3A.29) that

$$\lim_{n\to\infty} C(n) = \lim_{n\to\infty}\xi(\bar{w}(n)) = \hat{\xi} \tag{3A.31}$$

which satisfies (3A.2). Furthermore, (3A.30) implies that (3A.3) is satisfied and $k_{ww}(0) = 0$
implies that $D_i(0) = 0$, which in turn satisfies (3A.4). Since (3A.25) is propagated by a
symmetric, positive definite transition matrix for all values of $\mu$, this implies that all
eigenvalues of the system (3A.25) are positive and real. This satisfies condition (3A.5). The
condition (3A.6) is satisfied if

$$\alpha_i + \beta_i < 1 \quad \forall i \tag{3A.32}$$

or

$$1 - 4\mu\lambda_i + 8\mu^2\lambda_i^2 < 1 \quad \forall i \tag{3A.33}$$

This is satisfied if

$$0 < \mu < \frac{1}{2\lambda_{\max}} \tag{3A.34}$$

The condition (3A.7) is now checked. Substituting for $\alpha_i$ and $\beta_i$ in (3A.7),

$$\prod_{i=1}^{L}\{\delta + 4\mu\lambda_i(1 - \mu\lambda_i)\} - \sum_{i=1}^{L}4\mu^2\lambda_i^2\prod_{l \ne i}^{L}\{\delta + 4\mu\lambda_l(1 - \mu\lambda_l)\} > 0, \quad \text{for } \delta > 0 \tag{3A.35}$$

It follows from (3A.34) that

$$\prod_{i=1}^{L}\{\delta + 4\mu\lambda_i(1 - \mu\lambda_i)\} > 0, \quad \text{for } \delta > 0 \tag{3A.36}$$

Dividing (3A.35) by $\prod_{i=1}^{L}\{\delta + 4\mu\lambda_i(1 - \mu\lambda_i)\}$, the following condition is derived:

$$\sum_{i=1}^{L}\frac{4\mu^2\lambda_i^2}{\delta + 4\mu\lambda_i(1 - \mu\lambda_i)} < 1, \quad \text{for } \delta > 0 \tag{3A.37}$$

This implies that (3A.7) is satisfied if

$$\mu\sum_{i=1}^{L}\frac{\lambda_i}{1 - \mu\lambda_i} < 1 \tag{3A.38}$$

Thus, when (3A.34) and (3A.38) are true, all conditions of the theorem are satisfied. It
follows from (3A.8) that

$$\lim_{n\to\infty}\sum_{i=1}^{L} D_i(n) \tag{3A.39}$$

exists and from (3A.9),

$$\lim_{n\to\infty}\sum_{i=1}^{L} D_i(n) = \hat{\xi}\,\frac{\mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}} \tag{3A.40}$$

Noting that

$$\sum_{i=1}^{L} D_i(n) = \lambda^T\eta(n) \tag{3A.41}$$

(3A.39) implies that

$$\lim_{n\to\infty}\lambda^T\eta(n) \tag{3A.42}$$

exists.

Substituting for various quantities in (3A.40),

$$\lim_{n\to\infty}\lambda^T\eta(n) = \hat{\xi}\,\frac{\mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}}{1 - \mu\sum_{i=1}^{L}\dfrac{\lambda_i}{1-\mu\lambda_i}} \tag{3A.43}$$

which is (3.2.90).
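
The limit given by Theorem 3.1 can be cross-checked by iterating the difference equations directly. The sketch below is an illustration only: it uses the coefficients of (3A.27) and (3A.28) with arbitrarily chosen eigenvalues, step size and minimum MSE (all assumed values), sets $c_i(n) = 0$ and $D_i(0) = 0$, and holds C(n) at its limiting value.

```python
import numpy as np


def closed_form_limit(alpha, beta, c_inf):
    """Limit of sum_i D_i(n) according to (3A.9)."""
    r = np.sum(beta / (1.0 - alpha))
    return c_inf * r / (1.0 - r)


def iterate_sum(alpha, beta, c_inf, n_iter=5000):
    """Iterate (3A.1) with c_i(n) = 0, D_i(0) = 0 and C(n) fixed at c_inf."""
    d = np.zeros_like(alpha)
    for _ in range(n_iter):
        d = alpha * d + beta * (c_inf + d.sum())
    return d.sum()


mu, xi_min = 0.01, 0.3                    # assumed step size and minimum MSE
lam = np.array([1.0, 2.0, 5.0])           # assumed eigenvalues of R
alpha = 1 - 4 * mu * lam + 4 * (mu * lam) ** 2   # (3A.27)
beta = 4 * (mu * lam) ** 2                       # (3A.28)

print("iterated limit    :", iterate_sum(alpha, beta, xi_min))
print("closed form (3A.9):", closed_form_limit(alpha, beta, xi_min))
```

For these values both (3A.34) and (3A.38) hold, and the iterated sum settles at the closed-form value.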


Appendix 3.2 

In this appendix, a derivation of (3.4.6) is presented. Rewrite (3.2.10): 

$$V_g(w(n)) = E[g(w(n)) g^H(w(n))] - \bar{g}(w(n))\bar{g}^H(w(n)) \tag{3A.44}$$

where E[.] denotes the conditional expectation for a given w(n) and

$$\bar{g}(w(n)) = E[g(w(n))] \tag{3A.45}$$

It follows from (3.4.5) and (3A.45) that

$$\bar{g}(w(n)) = 2 R w(n) \tag{3A.46}$$

Thus, the second term on the RHS of (3A.44) becomes

$$\bar{g}(w(n))\bar{g}^H(w(n)) = 4 R w(n) w^H(n) R \tag{3A.47}$$

It follows from (3.4.4) that

$$g(w(n)) g^H(w(n)) = 4 x(n+1) x^H(n+1) w(n) w^H(n) x(n+1) x^H(n+1) \tag{3A.48}$$

If {x(k)} is a complex i.i.d. Gaussian sequence, then for any Hermitian matrix A the
following result holds, using (3.2.9):

$$E[x(n) x^H(n) A x(n) x^H(n)] = RAR + \mathrm{Tr}(RA) R \tag{3A.49}$$

Taking the conditional expectation on both sides of (3A.48) for a given w(n) and using
(3A.49),


© 2004 by CRC Press LLC 



E[g(w(n))g H (w(n))J = 4w H (n)Rw(n)R + 4Rw(n)w H (n)R 


(3A.50) 


Substituting from (3A.47) and (3A.50) in (3A.44), (3.4.6) is derived. 
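
As a worked check of this last substitution, the two terms can be written out explicitly. The lines below use only (3A.44), (3A.47), and (3A.50); identifying the final line with (3.4.6) relies on the sentence above rather than on the text of Section 3.4 itself.

```latex
\begin{aligned}
V_g(w(n)) &= E[g(w(n))g^H(w(n))] - \bar{g}(w(n))\bar{g}^H(w(n)) \\
          &= \left[4w^H(n)Rw(n)\,R + 4Rw(n)w^H(n)R\right] - 4Rw(n)w^H(n)R \\
          &= 4\,w^H(n)Rw(n)\,R
\end{aligned}
```

The outer-product terms cancel, leaving the array correlation matrix scaled by the scalar $w^H(n)Rw(n)$.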


Appendix 3.3 

In this appendix, a derivation of (3.4.8) and (3.4.9) is presented. It is similar to the results 
for the unconstrained algorithm presented in Sections 3.2.2 and 3.2.3. 

It follows from (3.4.1) and (3.4.4) that 

$$w(n+1) = P\{w(n) - 2\mu x(n+1)x^H(n+1)w(n)\} + \frac{S_0}{L} \tag{3A.51}$$

When w(n) and x(n + 1) are uncorrelated it follows by taking the unconditional expectation 
on both sides of (3A.51) that 

$$\bar{w}(n+1) = P\{\bar{w}(n) - 2\mu R\bar{w}(n)\} + \frac{S_0}{L} \tag{3A.52}$$

where 

$$\bar{w}(n) = E[w(n)] \tag{3A.53}$$

Define a mean error vector v(n) as 

$$v(n) = \bar{w}(n) - \hat{w} \tag{3A.54}$$

where $\hat{w}$ is the optimal vector given by (2.4.21), that is, 

$$\hat{w} = \frac{R^{-1}S_0}{S_0^H R^{-1} S_0} \tag{3A.55}$$

Subtracting $\hat{w}$ from both sides of (3A.52) and using (3A.54), the following mean error 
vector update equation is derived: 

$$v(n+1) = Pv(n) + P\hat{w} - 2\mu PRv(n) - 2\mu PR\hat{w} - \hat{w} + \frac{S_0}{L} \tag{3A.56}$$

From (3.4.2) and the fact that 

$$\hat{w}^H S_0 = 1 \tag{3A.57}$$

it follows that 

$$P\hat{w} = \hat{w} - \frac{S_0}{L} \tag{3A.58}$$

Similarly, (3.4.2) and (3A.55) imply that 

$$PR\hat{w} = 0 \tag{3A.59}$$

Thus, (3A.56) becomes 

$$v(n+1) = P(I - 2\mu R)v(n) \tag{3A.60}$$

where I is an identity matrix. 

Since $P^2 = P$, it follows from (3A.60) that 

$$Pv(\cdot) = v(\cdot) \tag{3A.61}$$

Thus, (3A.60) becomes 

$$v(n+1) = (I - 2\mu PRP)v(n) = (I - 2\mu PRP)^{n+1} v(0) \tag{3A.62}$$

Using the properties of a norm, it follows from (3A.62) that 

$$|1 - 2\mu\hat{\lambda}_{\max}|^{n+1} \|v(0)\| \le \|v(n+1)\| \le |1 - 2\mu\hat{\lambda}_{\min}|^{n+1} \|v(0)\| \tag{3A.63}$$

From (3.4.7), 

$$0 < \mu < \frac{1}{\hat{\lambda}_{\max}} \tag{3A.64}$$

Thus, $|1 - 2\mu\hat{\lambda}_{\max}| < 1$ and implies that 

$$\lim_{n\to\infty} |1 - 2\mu\hat{\lambda}_{\max}|^{n} = 0 \tag{3A.65}$$

and 

$$\lim_{n\to\infty} |1 - 2\mu\hat{\lambda}_{\min}|^{n} = 0 \tag{3A.66}$$

Since $\|v(0)\| < \infty$, it follows from (3A.63) to (3A.66) that 

$$\lim_{n\to\infty} \|v(n+1)\| = 0 \tag{3A.67}$$

which along with (3A.54) implies that 

$$\lim_{n\to\infty} E[w(n) - \hat{w}] = 0 \tag{3A.68}$$

This establishes 

$$\lim_{n\to\infty} E[w(n)] = \hat{w} \tag{3A.69}$$
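
The mean-weight recursion (3A.52) can be iterated numerically to see the convergence claimed in (3A.69). The sketch below is illustrative only: it uses a real-valued two-element example, takes the projection as P = I - S_0 S_0^T / L (an assumption consistent with (3A.58), since (3.4.2) is not reproduced here), and the matrix `R`, vector `s0`, and step size `mu` are made-up values.

```python
def matvec(A, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mean_weight_recursion(R, s0, mu, n_iter):
    """Iterate w_bar(n+1) = P{w_bar(n) - 2 mu R w_bar(n)} + S_0/L, as in (3A.52)."""
    L = len(s0)
    # Assumed projection matrix: P = I - s0 s0^T / L (real-valued case).
    P = [[(1.0 if i == j else 0.0) - s0[i] * s0[j] / L for j in range(L)]
         for i in range(L)]
    F = [s / L for s in s0]
    w = F[:]  # start from the quiescent weights S_0/L
    for _ in range(n_iter):
        Rw = matvec(R, w)
        step = [wi - 2.0 * mu * rwi for wi, rwi in zip(w, Rw)]
        w = [pi + fi for pi, fi in zip(matvec(P, step), F)]
    return w

# Hypothetical example data.
R = [[2.0, 0.5], [0.5, 1.0]]
s0 = [1.0, 1.0]
w_mean = mean_weight_recursion(R, s0, mu=0.1, n_iter=300)

# Optimal weights from (3A.55), worked by hand for this R and s0.
w_opt = [0.25, 0.75]
```

For this example the only nonzero eigenvalue of PRP is 1, so the error shrinks by the factor (1 - 2 mu) = 0.8 per iteration, in line with (3A.62).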


To obtain the convergence time constant along an eigenvector of PRP, consider 

$$v(0) = \sum_{i=1}^{L} \alpha_i Q_i \tag{3A.70}$$

where $\alpha_i$, i = 1, 2, ..., L are scalars and $Q_i$, i = 1, 2, ..., L are eigenvectors corresponding 
to L eigenvalues of PRP. 
From (3A.62) and (3A.70), it follows that 

$$v(n+1) = (I - 2\mu PRP)^{n+1} \sum_{i=1}^{L} \alpha_i Q_i \tag{3A.71}$$

Since eigenvectors of PRP are orthonormal, (3A.71) can be expressed as 

$$v(n+1) = \sum_{i=1}^{L} (1 - 2\mu\hat{\lambda}_i)^{n+1} \alpha_i Q_i \tag{3A.72}$$

The convergence of the mean weight vector to the optimal weight vector along the ith 
eigenvector of PRP is therefore geometric with geometric ratio $(1 - 2\mu\hat{\lambda}_i)$. If an exponential 
envelope of time constant $\tau_i$ is fitted to the geometric sequence of (3A.72), then 

$$\tau_i = \frac{-1}{\ln(1 - 2\mu\hat{\lambda}_i)} \tag{3A.73}$$

where the unit of time is assumed to be one iteration. Note that if 

$$2\mu\hat{\lambda}_i \ll 1 \tag{3A.74}$$

then 

$$\tau_i \approx \frac{1}{2\mu\hat{\lambda}_i} \tag{3A.75}$$
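
The relation between the exact time constant (3A.73) and its small-step approximation (3A.75) is easy to check numerically. The following is a minimal sketch; the function names and the sample values of `mu` and `lam` are illustrative and not from the text.

```python
import math

def time_constant(mu, lam):
    """Exact time constant (in iterations) for geometric ratio 1 - 2*mu*lam, as in (3A.73)."""
    ratio = 1.0 - 2.0 * mu * lam
    if not 0.0 < ratio < 1.0:
        # This sketch only handles a positive, decaying ratio.
        raise ValueError("need 0 < 2*mu*lam < 1")
    return -1.0 / math.log(ratio)

def time_constant_approx(mu, lam):
    """Small-step approximation 1 / (2*mu*lam), as in (3A.75)."""
    return 1.0 / (2.0 * mu * lam)

tau_exact = time_constant(0.01, 1.0)
tau_approx = time_constant_approx(0.01, 1.0)
```

With 2 mu lambda = 0.02 the two values differ by about half an iteration, which illustrates how tight the approximation is when (3A.74) holds.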

Appendix 3.4 

In this appendix, a derivation of (3.4.14) is presented [God86]. It follows from (3.4.12) and 
(3.4.13) that 

$$k_{ww}(n) = R_{ww}(n) - \bar{w}(n)\bar{w}^H(n) \tag{3A.76}$$

where 

$$R_{ww}(n) = E[w(n)w^H(n)] \tag{3A.77}$$

Taking the outer product of (3.4.1), 

$$\begin{aligned} w(n+1)w^H(n+1) ={}& Pw(n)w^H(n)P - \mu P[g(w(n))w^H(n) + w(n)g^H(w(n))]P \\ & - \mu[Pg(w(n))F^H + Fg^H(w(n))P] + \mu^2[Pg(w(n))g^H(w(n))P] \\ & + FF^H + Pw(n)F^H + Fw^H(n)P \end{aligned} \tag{3A.78}$$

where $F = S_0/L$. 

Taking the conditional expectation of both sides of (3A.78) with respect to w(n) and 
using (3.4.5), 

$$\begin{aligned} E[w(n+1)w^H(n+1)|w(n)] ={}& Pw(n)w^H(n)P - 2\mu P[Rw(n)w^H(n) + w(n)w^H(n)R]P \\ & - 2\mu[PRw(n)F^H + Fw^H(n)RP] + \mu^2 PE[g(w(n))g^H(w(n))|w(n)]P \\ & + FF^H + Pw(n)F^H + Fw^H(n)P \end{aligned} \tag{3A.79}$$

Taking the expectation on both sides over w(n), (3A.79) yields 

$$\begin{aligned} R_{ww}(n+1) ={}& PR_{ww}(n)P - 2\mu P[RR_{ww}(n) + R_{ww}(n)R]P \\ & - 2\mu[PR\bar{w}(n)F^H + F\bar{w}^H(n)RP] + \mu^2 PE[g(w(n))g^H(w(n))]P \\ & + FF^H + P\bar{w}(n)F^H + F\bar{w}^H(n)P \end{aligned} \tag{3A.80}$$

Taking the expected value of (3.4.1), 

$$\bar{w}(n+1) = P[\bar{w}(n) - \mu E[g(w(n))]] + F \tag{3A.81}$$

It follows from (3.4.5) and (3A.81) that 

$$\begin{aligned} \bar{w}(n+1)\bar{w}^H(n+1) ={}& P\bar{w}(n)\bar{w}^H(n)P - 2\mu P[R\bar{w}(n)\bar{w}^H(n) + \bar{w}(n)\bar{w}^H(n)R]P \\ & - 2\mu[PR\bar{w}(n)F^H + F\bar{w}^H(n)RP] + \mu^2 PE[g(w(n))]E[g^H(w(n))]P \\ & + FF^H + P\bar{w}(n)F^H + F\bar{w}^H(n)P \end{aligned} \tag{3A.82}$$

Subtracting (3A.82) from (3A.80) and using (3A.76), 

$$\begin{aligned} k_{ww}(n+1) ={}& Pk_{ww}(n)P - 2\mu P[Rk_{ww}(n) + k_{ww}(n)R]P \\ & + \mu^2 P\{E[g(w(n))g^H(w(n))] - E[g(w(n))]E[g^H(w(n))]\}P \end{aligned} \tag{3A.83}$$

By definition, 

$$V_g(w(n)) = E[g(w(n))g^H(w(n))|w(n)] - E[g(w(n))|w(n)]E[g^H(w(n))|w(n)] \tag{3A.84}$$

Using (3.4.5), it follows from (3A.84) that 

$$V_g(w(n)) = E[g(w(n))g^H(w(n))|w(n)] - 4Rw(n)w^H(n)R \tag{3A.85}$$

From (3A.85), taking the expected value over w, 

$$E[V_g(w(n))] = E[g(w(n))g^H(w(n))] - 4RR_{ww}(n)R \tag{3A.86}$$

which implies 

$$E[g(w(n))g^H(w(n))] = E[V_g(w(n))] + 4RR_{ww}(n)R \tag{3A.87}$$

From (3.4.5), taking expected value over w, 

$$E[g(w(n))] = 2R\bar{w}(n) \tag{3A.88}$$

The outer product of (3A.88) results in 

$$E[g(w(n))]E[g^H(w(n))] = 4R\bar{w}(n)\bar{w}^H(n)R \tag{3A.89}$$

Subtracting (3A.89) from (3A.87) and substituting in (3A.83), 

$$\begin{aligned} k_{ww}(n+1) ={}& Pk_{ww}(n)P - 2\mu P[Rk_{ww}(n) + k_{ww}(n)R]P \\ & + 4\mu^2 PRk_{ww}(n)RP + \mu^2 PE[V_g(w(n))]P \end{aligned} \tag{3A.90}$$

which is (3.4.14). 


Appendix 3.5 

In this appendix, a proof of the diagonalization conditions stated in Section 3.4.5 is 
presented [God86]. It follows directly from (3.4.1) and (3.4.12) that 

$$Pk_{ww}(n)P = Pk_{ww}(n) = k_{ww}(n)P = k_{ww}(n) \tag{3A.91}$$

Thus, (3.4.14) can be expressed as 

$$\begin{aligned} k_{ww}(n+1) ={}& k_{ww}(n) - 2\mu\, PRP\, k_{ww}(n) - 2\mu\, k_{ww}(n)\, PRP \\ & + \mu^2 PE[V_g(w(n))]P + 4\mu^2\, PRP\, k_{ww}(n)\, PRP \end{aligned} \tag{3A.92}$$

Since PRP is a Hermitian matrix, a unitary matrix Q exists, such that 

$$Q^H\, PRP\, Q = \Lambda \tag{3A.93}$$

where $\Lambda$ is a diagonal matrix with the diagonal elements being the eigenvalues of PRP. 
Define 

$$\Sigma(n) = Q^H k_{ww}(n) Q \tag{3A.94}$$

and 

$$\Omega(n) = Q^H PE[V_g(w(n))]PQ \tag{3A.95}$$

Pre- and postmultiplying (3A.92) by $Q^H$ and Q, respectively, and using (3A.93), (3A.94), 
and (3A.95), 

$$\Sigma(n+1) = [\Sigma(n) - 2\mu\Lambda\Sigma(n) - 2\mu\Sigma(n)\Lambda + 4\mu^2\Lambda\Sigma(n)\Lambda] + \mu^2\Omega(n) \tag{3A.96}$$

In view of (3A.93), (3A.94), and (3A.95), the statement of diagonalization conditions 
becomes "the necessary and the sufficient condition for $\Sigma(n+1)$, $n \ge 0$, to be a diagonal 
matrix is that $\Omega(n)$ is a diagonal matrix for all n." This is proved by induction. 

Consider n = 0. Since initial weight vector w(0) is a known constant, it follows that 

$$E[w(0)] = w(0) \tag{3A.97}$$

It follows from (3.4.12) that 

$$k_{ww}(0) = E[(w(0) - \bar{w}(0))(w(0) - \bar{w}(0))^H] = 0 \tag{3A.98}$$

Equations (3A.94) and (3A.98) imply that 

$$\Sigma(0) = 0 \tag{3A.99}$$

From (3A.96) and (3A.99) one obtains 

$$\Sigma(1) = \mu^2\Omega(0) \tag{3A.100}$$

It follows from (3A.100) that the necessary and sufficient condition for $\Sigma(1)$ to be a diagonal 
matrix is that $\Omega(0)$ is a diagonal matrix. This proves the diagonalization conditions for n = 0. 
Consider n = 1. Assume that $\Sigma(1)$ is a diagonal matrix. For n = 1, (3A.96) becomes 

$$\Sigma(2) = [\Sigma(1) - 2\mu\Lambda\Sigma(1) - 2\mu\Sigma(1)\Lambda + 4\mu^2\Lambda\Sigma(1)\Lambda] + \mu^2\Omega(1) \tag{3A.101}$$

Since $\Sigma(1)$ and $\Lambda$ are diagonal matrices, it follows that the terms in the square bracket of 
(3A.101) form a diagonal matrix. Thus, it follows from (3A.101) that for $\Sigma(2)$ to be a 
diagonal matrix, the necessary and sufficient condition is that $\Omega(1)$ is a diagonal matrix. 
This proves the theorem for n = 1. 

Finally, assume that $\Sigma(n)$ is a diagonal matrix. Since $\Lambda$ is a diagonal matrix, the terms 
in the square bracket of (3A.96) form a diagonal matrix. Thus, it follows from (3A.96) that 
for $\Sigma(n+1)$ to be a diagonal matrix, the necessary and sufficient condition is that $\Omega(n)$ is 
a diagonal matrix. This completes the steps necessary for the proof by induction. 


Appendix 3.6 

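In this appendix, a proof of (3.4.52) is presented [God86]. Let 

$$D(n) = \Lambda'\eta'(n) \tag{3A.102}$$

where $\eta'(n)$ is defined by (3.4.36) and denotes the L - 1 diagonal elements of $Q^H k_{ww}(n) Q$, 
and $\Lambda'$ is the diagonal matrix of the L - 1 nonzero eigenvalues of PRP. 

It follows from (3.4.37), (3.4.38), and (3A.102) that 

$$D(n+1) = (I - 4\mu\Lambda' + 4\mu^2\Lambda'^2 + 4\mu^2\Lambda'^2 \mathbf{1}\mathbf{1}^T)D(n) + 4\mu^2\Lambda'^2 \mathbf{1}\, k_0(n) \tag{3A.103}$$

From (3A.103), a difference equation of the ith component of D(.) is given by 

$$D_i(n+1) = (1 - 4\mu\hat{\lambda}_i + 4\mu^2\hat{\lambda}_i^2)D_i(n) + 4\mu^2\hat{\lambda}_i^2 \left[ k_0(n) + \sum_{j=1}^{L-1} D_j(n) \right] \tag{3A.104}$$

With 

$$\alpha_i = 1 - 4\mu\hat{\lambda}_i + 4\mu^2\hat{\lambda}_i^2 \tag{3A.105}$$

$$\beta_i = 4\mu^2\hat{\lambda}_i^2 \tag{3A.106}$$

$$C(n) = k_0(n) = \bar{w}^H(n)R\bar{w}(n) \tag{3A.107}$$

and 

$$c_i(n) = 0 \tag{3A.108}$$

equation (3A.104) is similar to (3A.1) and Theorem 3A.1 can be applied, provided the 
conditions (3A.2) to (3A.7) are satisfied. 

Since 

$$\lim_{n\to\infty} \bar{w}(n) = \hat{w} \tag{3A.109}$$

it follows from (3A.107) that 

$$\lim_{n\to\infty} C(n) = \hat{w}^H R\hat{w} \tag{3A.110}$$

which is (3A.2). 

Equation (3A.108) implies (3A.3), and $k_{ww}(0) = 0$ implies that $D_i(0) = 0$, which satisfies (3A.4). 

Note that (3A.103) is propagated by a symmetric, positive definite transition matrix for 
all values of $\mu$. This implies that all eigenvalues of the system (3A.103) are positive and 
real, which satisfies (3A.5). Following the argument used in (3A.32) to (3A.34), it can be 
shown that (3A.6) is satisfied if 

$$0 < \mu < \frac{1}{2\hat{\lambda}_{\max}} \tag{3A.111}$$

which is (3.4.50). Thus, (3A.6) is satisfied. 

Condition (3A.7) is checked in the following. Substituting for $\alpha_i$ and $\beta_i$ in (3A.7), the 
condition 

$$\prod_{i}[\delta + 4\mu\hat{\lambda}_i(1-\mu\hat{\lambda}_i)] - \sum_{i} 4\mu^2\hat{\lambda}_i^2 \prod_{l\neq i}[\delta + 4\mu\hat{\lambda}_l(1-\mu\hat{\lambda}_l)] > 0 \quad \text{for } \delta > 0 \tag{3A.112}$$

is derived. As (3A.111) implies that 

$$\prod_{i}[\delta + 4\mu\hat{\lambda}_i(1-\mu\hat{\lambda}_i)] > 0 \quad \text{for } \delta > 0 \tag{3A.113}$$

(3A.112) becomes 

$$\sum_{i} \frac{\mu\hat{\lambda}_i}{\dfrac{\delta}{4\mu\hat{\lambda}_i} + (1-\mu\hat{\lambda}_i)} < 1, \quad \delta > 0 \tag{3A.114}$$

This implies that (3A.7) is satisfied if 

$$\sum_{i=1}^{L-1} \frac{\mu\hat{\lambda}_i}{1-\mu\hat{\lambda}_i} < 1 \tag{3A.115}$$

which is (3.4.51). Thus, all conditions of Theorem 3A.1 are satisfied. Thus, $\lim_{n\to\infty}\sum_{i=1}^{L-1} D_i(n)$ 
exists and from (3A.9) it follows that 

$$\lim_{n\to\infty} \sum_{i=1}^{L-1} D_i(n) = \lim_{n\to\infty} C(n)\, \frac{\sum_{i=1}^{L-1} \dfrac{\beta_i}{1-\alpha_i}}{1 - \sum_{i=1}^{L-1} \dfrac{\beta_i}{1-\alpha_i}} \tag{3A.116}$$

It follows from (3A.102), (3.4.47), and $\hat{\lambda}_L = 0$ that 

$$\sum_{i=1}^{L-1} D_i(n) = \hat{\lambda}^T\eta(n) \tag{3A.117}$$

From (3A.105), (3A.106), (3A.110), (3A.116), and (3A.117), 

$$\lim_{n\to\infty} \hat{\lambda}^T\eta(n) = \hat{w}^H R\hat{w}\, \frac{\sum_{i=1}^{L-1} \dfrac{\mu\hat{\lambda}_i}{1-\mu\hat{\lambda}_i}}{1 - \sum_{i=1}^{L-1} \dfrac{\mu\hat{\lambda}_i}{1-\mu\hat{\lambda}_i}} \tag{3A.118}$$

which along with (3.4.49) implies (3.4.52). 

Note that the existence of $\lim_{n\to\infty}\sum_{i=1}^{L-1} D_i(n)$, $0 < \hat{\lambda}_i < \infty\ \forall i$, and (3A.102) imply that $\lim_{n\to\infty}\eta'(n)$ 
exists. This completes the derivation. 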
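
The steady-state value in (3A.116) can be cross-checked by iterating the scalar recursions (3A.104) directly. The sketch below makes one simplifying assumption that is not in the text: C(n) is held constant at its limiting value from the first iteration. The eigenvalues, step size, and the value used for C are made up for illustration.

```python
def steady_state_sum(mu, eigenvalues, c_limit):
    """Closed-form limit of sum_i D_i(n), using beta_i/(1 - alpha_i) = mu*lam/(1 - mu*lam)."""
    theta = sum(mu * lam / (1.0 - mu * lam) for lam in eigenvalues)
    if not theta < 1.0:
        raise ValueError("condition (3A.115) is violated")
    return c_limit * theta / (1.0 - theta)

def iterate_recursion(mu, eigenvalues, c_limit, n_iter):
    """Iterate (3A.104) from D_i(0) = 0 with C(n) fixed at c_limit (a simplification)."""
    d = [0.0] * len(eigenvalues)
    for _ in range(n_iter):
        total = sum(d)
        d = [(1.0 - 4.0 * mu * lam + 4.0 * mu**2 * lam**2) * di
             + 4.0 * mu**2 * lam**2 * (c_limit + total)
             for di, lam in zip(d, eigenvalues)]
    return sum(d)

# Hypothetical values: two nonzero eigenvalues of PRP and a step size below 1/(2*lam_max).
eigs = [1.0, 0.5]
closed_form = steady_state_sum(0.05, eigs, c_limit=2.0)
iterated = iterate_recursion(0.05, eigs, c_limit=2.0, n_iter=5000)
```

Agreement between the two numbers only confirms the fixed point of the recursion; it does not replace the conditions of Theorem 3A.1, which guarantee that the limit exists.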


Appendix 3.7 

In this appendix, the results presented in Section 3.13 are derived. The following result, 
which follows from (3.2.9), is used here. 

If {x(k)} is a complex i.i.d. Gaussian sequence, then for any Hermitian matrix A the 
following result holds: 

$$E[x(n)x^H(n)Ax(n)x^H(n)] = RAR + \mathrm{Tr}(RA)R \tag{3A.119}$$

where R is the array correlation matrix and Tr(.) denotes the trace. 

Proof of (3.13.10): Let 

$$\bar{g}(w(n)) = E[g(w(n))|w(n)] \tag{3A.120}$$

Since $V_g(w(n))$ denotes the covariance of the gradient estimate for a given w(n), it follows 
from the definition of covariance that 

$$\begin{aligned} V_g(w(n)) &= E[\{g(w(n)) - \bar{g}(w(n))\}\{g^*(w(n)) - \bar{g}^*(w(n))\}|w(n)] \\ &= E[g(w(n))g^*(w(n))|w(n)] - \bar{g}(w(n))\bar{g}^*(w(n)) \end{aligned} \tag{3A.121}$$

It follows from (3.13.3) and (3.13.8) that 

$$g(w(n))g^*(w(n)) = 4\{\psi(n) - w(n)q(n)\}q^*(n)q(n)\{\psi(n) - w(n)q(n)\}^* \tag{3A.122}$$

Substituting from (3.13.1) and (3.13.2), (3A.122) leads to 

$$\begin{aligned} g(w(n))g^*(w(n)) ={}& 4V^H x(n)x^H(n)UU^H x(n)x^H(n)V \\ & + 4w^*(n)w(n)U^H x(n)x^H(n)UU^H x(n)x^H(n)U \\ & - 4w(n)U^H x(n)x^H(n)UU^H x(n)x^H(n)V \\ & - 4w^*(n)V^H x(n)x^H(n)UU^H x(n)x^H(n)U \end{aligned} \tag{3A.123}$$

Taking the expectation over x(.) for a given w(n) on both sides of (3A.123) and using 
(3A.119), 

$$\begin{aligned} E[g(w(n))g^*(w(n))|w(n)] ={}& 4V^H RUU^H RV + 4U^H RUV^H RV \\ & + 8w^*(n)w(n)(U^H RU)^2 \\ & - 8w(n)U^H RUU^H RV \\ & - 8w^*(n)U^H RUV^H RU \end{aligned} \tag{3A.124}$$

Since 

$$\begin{aligned} \bar{g}(w(n))\bar{g}^*(w(n)) ={}& 4w^*(n)w(n)(U^H RU)^2 + 4V^H RUU^H RV \\ & - 4w(n)U^H RUU^H RV - 4w^*(n)U^H RUV^H RU \end{aligned} \tag{3A.125}$$

it follows from (3A.121), (3A.124), and (3A.125) that 

$$V_g(w(n)) = 4U^H RU[V^H RV + w^*(n)w(n)U^H RU - w(n)U^H RV - w^*(n)V^H RU] \tag{3A.126}$$


which proves the result. 

Derivation of (3.13.13) and (3.13.14): Let 

$$e(n) = w(n) - \hat{w} \tag{3A.127}$$

Using (3.13.6), 

$$e(n+1) = e(n) - \mu g(w(n)) \tag{3A.128}$$

Taking the expected value on both sides of (3A.128), 

$$E[e(n+1)] = E[e(n)] - \mu E[g(w(n))] \tag{3A.129}$$

From (3.13.9) and (3A.127), it follows that 

$$E[g(w(n))|w(n)] = -2V^H RU + 2e(n)U^H RU + 2\hat{w}U^H RU \tag{3A.130}$$

which along with (3.13.5), implies that 

$$E[g(w(n))|w(n)] = 2e(n)U^H RU \tag{3A.131}$$

Taking the unconditional expectation on both sides of (3A.131) and substituting in 
(3A.129), 

$$E[e(n+1)] = E[e(n)] - 2\mu E[e(n)]U^H RU \tag{3A.132}$$

Let 

$$f(n) = E[e(n)] \tag{3A.133}$$

Then, 

$$f(n+1) = (1 - 2\mu U^H RU)f(n) \tag{3A.134}$$

which implies 

$$f(n+1) = (1 - 2\mu U^H RU)^{n+1} f(0) \tag{3A.135}$$

For 

$$0 < \mu < \frac{1}{U^H RU} \tag{3A.136}$$

$$|1 - 2\mu U^H RU| < 1 \tag{3A.137}$$

Furthermore, it follows from (3.13.11), (3A.127), and (3A.133) that 

$$|f(0)| < \infty \tag{3A.138}$$

Thus, it follows from (3A.135), (3A.137), and (3A.138) that 

$$\lim_{n\to\infty} f(n) = 0 \tag{3A.139}$$

Equation (3A.139) along with (3A.127) and (3A.133) implies (3.13.13). 

To derive (3.13.14), consider (3A.135). It follows from (3A.135) that the convergence of 
the mean weight to the optimum weight is geometric, with the geometric ratio $(1 - 2\mu U^H RU)$. 
If an exponential envelope of time constant $\tau$ is fitted to the geometric sequence of (3A.135), 
then 

$$\tau = \frac{-1}{\ln(1 - 2\mu U^H RU)} \tag{3A.140}$$

which is (3.13.14). 
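
The geometric decay in (3A.135) and the fitted exponential envelope of (3A.140) describe the same sequence, which a few lines of code can confirm. In this sketch the scalar `a` stands in for $U^H RU$ and `f0` for the initial mean error f(0); both values, and the step size, are invented for illustration.

```python
import math

def mean_error(f0, a, mu, n):
    """Apply f(k+1) = (1 - 2*mu*a) f(k) n times, as in (3A.134)."""
    f = f0
    for _ in range(n):
        f = (1.0 - 2.0 * mu * a) * f
    return f

def time_constant(a, mu):
    """tau = -1 / ln(1 - 2*mu*a), as in (3A.140); this sketch assumes 0 < 2*mu*a < 1."""
    return -1.0 / math.log(1.0 - 2.0 * mu * a)

a = 2.0            # hypothetical value of U^H R U (real and positive)
mu = 0.05          # satisfies 0 < mu < 1/a
f0 = 1.0 - 0.5j    # hypothetical initial mean error (the weight is a complex scalar)

n = 25
decay = abs(mean_error(f0, a, mu, n)) / abs(f0)
envelope = math.exp(-n / time_constant(a, mu))
```

Here `decay` and `envelope` agree to rounding error, since (1 - 2 mu a)^n = exp(-n / tau) by the definition of tau.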

Derivation of (3.13.17): It follows from (3.13.15) and (3.13.16) that 

$$k_{ww}(n) = R_{ww}(n) - \bar{w}(n)\bar{w}^*(n) \tag{3A.141}$$

where 

$$R_{ww}(n) = E[w(n)w^*(n)] \tag{3A.142}$$

It follows from (3.13.6) that 

$$\begin{aligned} w(n+1)w^*(n+1) ={}& w(n)w^*(n) + \mu^2 g(w(n))g^*(w(n)) \\ & - \mu w^*(n)g(w(n)) - \mu g^*(w(n))w(n) \end{aligned} \tag{3A.143}$$

Taking the conditional expectation with respect to w(n) on both sides of (3A.143), 

$$\begin{aligned} E[w(n+1)w^*(n+1)|w(n)] ={}& E[w(n)w^*(n)|w(n)] + \mu^2 E[g(w(n))g^*(w(n))|w(n)] \\ & - \mu E[w^*(n)g(w(n))|w(n)] - \mu E[g^*(w(n))w(n)|w(n)] \end{aligned} \tag{3A.144}$$

Since 

$$E[g^*(w(n))g(w(n))|w(n)] = V_g(w(n)) + E[g^*(w(n))|w(n)]E[g(w(n))|w(n)] \tag{3A.145}$$

and 

$$E[g(w(n))|w(n)] = 2w(n)U^H RU - 2V^H RU \tag{3A.146}$$

it follows from (3A.144) that 

$$\begin{aligned} E[w(n+1)w^*(n+1)|w(n)] ={}& E[w^*(n)w(n)|w(n)][1 + 4\mu^2(U^H RU)^2 - 4\mu U^H RU] \\ & + \mu^2 V_g(w(n)) + 4\mu^2[V^H RUU^H RV \\ & - w^*(n)U^H RUV^H RU - w(n)U^H RUU^H RV] \\ & + 2\mu[w^*(n)V^H RU + w(n)U^H RV] \end{aligned} \tag{3A.147}$$

Taking the unconditional expectation on both sides of (3A.147) and using (3A.142), 

$$\begin{aligned} R_{ww}(n+1) ={}& R_{ww}(n)[1 + 4\mu^2(U^H RU)^2 - 4\mu U^H RU] \\ & + \mu^2 E[V_g(w(n))] + 4\mu^2 V^H RUU^H RV \\ & - 4\mu^2 U^H RU[\bar{w}^*(n)V^H RU + \bar{w}(n)U^H RV] \\ & + 2\mu[\bar{w}^*(n)V^H RU + \bar{w}(n)U^H RV] \end{aligned} \tag{3A.148}$$

Taking the unconditional expectation on both sides of (3.13.6) and using (3.13.16) and 
(3A.146), 

$$\bar{w}(n+1) = \bar{w}(n) - 2\mu\bar{w}(n)U^H RU + 2\mu V^H RU \tag{3A.149}$$

Taking the outer product of (3A.149), 

$$\begin{aligned} \bar{w}(n+1)\bar{w}^*(n+1) ={}& \bar{w}(n)\bar{w}^*(n)[1 + 4\mu^2(U^H RU)^2 - 4\mu U^H RU] \\ & + 4\mu^2 V^H RUU^H RV \\ & - 4\mu^2 U^H RU[\bar{w}^*(n)V^H RU + \bar{w}(n)U^H RV] \\ & + 2\mu[\bar{w}^*(n)V^H RU + \bar{w}(n)U^H RV] \end{aligned} \tag{3A.150}$$

From (3A.141), (3A.148), and (3A.150), one obtains 

$$k_{ww}(n+1) = k_{ww}(n)[1 + 4\mu^2(U^H RU)^2 - 4\mu(U^H RU)] + \mu^2 E[V_g(w(n))] \tag{3A.151}$$

which is (3.13.17). 

The beamformer structure of Figure 2.1 discussed earlier is for narrowband signals. As 
the signal bandwidth increases, beamformer performance using this structure starts to 
deteriorate [Rod79]. For processing broadband signals, the tapped delay line (TDL) structure 
shown in Figure 4.1 is normally used [Rod79, May81, Voo92, Com88, Ko81, Ko87, Nun83, 
Yeh87, Sco83]. A lattice structure consisting of a cascade of simple lattice filters is sometimes 
also used [Ale87, Lin86, Iig85, Soh84], offering certain processing advantages. 

FIGURE 4.1 
Broadband processor with tapped delay line structure. 

Although the TDL structure with constrained optimization is the commonly used structure 
for broadband array signal processing, alternative methods have been proposed. 
These include adaptive nonlinear schemes, which maximize the signal-to-noise ratio 
(SNR) subject to additional constraints [Win72]; a variation of the Davis beamformer 
[Dav67], which adapts one filter at a time to speed up convergence [Ko90]; a composite 
system that also utilizes a derivative of beam pattern in the feedback loop to control the 
weights [Tak80] to reject wideband interference; optimum filters that specify rejection 
response [Sim83]; master and slave processors [Hua90]; a hybrid method that uses an 
orthogonal transformation on data available from the TDL structure before applying 
weights [Che95] to improve its performance in multipath environments; the weighted 
Tschebysheff method [Nor94]; and the two-sided correlation transformation method 
[Val95]. 

In this chapter, details on an array processor using the TDL structure and its partitioned 
realization to process broadband array signals are provided, the time domain and frequency 
domain methods are described, and details on deriving various constraints are 
given [God95, God97, God99]. The treatment presented here is for solving a constrained 
beamforming problem, assuming that the look direction is known. It can easily be extended 
to the case when a reference signal is available. 


4.1 Tapped-Delay Line Structure 

In this section, a TDL structure for broadband antenna array processing is described, its 
frequency response and optimization are discussed, an LMS algorithm to estimate the 
solution of the point-constrained optimization problem is developed, and a design using 
minimum mean square error (MSE) between the frequency response of the processor and 
the desired response is presented. 

4.1.1 Description 

Figure 4.1 shows a general structure of a broadband antenna array processor consisting 
of L antenna elements, steering delays $T_l(\phi_0,\theta_0)$, l = 1, ..., L, and a delay line section of J - 1 
delays with inter-tap delay spacing T. The steering delays $T_l(\phi_0,\theta_0)$, l = 1, ..., L, in front of 
each element are pure time delays and are used to steer the array in a given look direction 
$(\phi_0,\theta_0)$. If $\tau_l(\phi_0,\theta_0)$ denotes the time taken by the plane wave arriving from direction $(\phi_0,\theta_0)$ 
and measured from the reference point to the lth element, then the steering delay $T_l(\phi_0,\theta_0)$ 
may be selected using 

$$T_l(\phi_0,\theta_0) = T_0 + \tau_l(\phi_0,\theta_0), \quad l = 1, \ldots, L \tag{4.1.1}$$

where $T_0$ is a bulk delay such that $T_l(\phi_0,\theta_0) > 0\ \forall l$. 

If s(t) denotes the signal induced on an element present at the center of the coordinate 
system due to a broadband source of power density S(f) in direction $(\phi,\theta)$, then the signal 
induced on the lth element is given by $s(t + \tau_l(\phi,\theta))$, as discussed in Chapter 2. 

Let $x_l(t)$ denote the output of the lth sensor presteered in $(\phi_0,\theta_0)$. It is given by 

$$x_l(t) = s(t + \tau_l(\phi,\theta) - T_l(\phi_0,\theta_0)) \tag{4.1.2}$$

For a source in $(\phi_0,\theta_0)$ it becomes $x_l(t) = s(t - T_0)$, yielding identical waveforms after 
presteering delays. 

The TDL structure shown in Figure 4.1 following the steering delay on each channel is 
a finite impulse response (FIR) filter. The coefficients of these filters are constrained to 
specify the frequency response in the look direction. It should be noted that these coefficients 
are real, in contrast to the complex weights of the narrowband processor. 
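
The choice of steering delays in (4.1.1) can be sketched for a concrete geometry. Everything specific below is an assumption made for illustration: a linear array along the x-axis, a single angle measured from broadside, free-space propagation speed, and a small positive margin used to pick the bulk delay. The function names are hypothetical.

```python
import math

PROPAGATION_SPEED = 3.0e8  # m/s, assumed (radio waves in free space)

def arrival_times(positions_m, theta_rad):
    """tau_l for elements on the x-axis; theta is measured from broadside (assumed geometry)."""
    return [x * math.sin(theta_rad) / PROPAGATION_SPEED for x in positions_m]

def steering_delays(positions_m, theta0_rad, margin_s=1.0e-9):
    """Return (T_0, [T_l]) following (4.1.1), with T_0 chosen so that every T_l >= margin_s."""
    tau0 = arrival_times(positions_m, theta0_rad)
    bulk = margin_s - min(tau0)
    return bulk, [bulk + t for t in tau0]

positions = [0.0, 0.5, 1.0, 1.5]      # element positions in metres (made up)
theta0 = math.radians(-30.0)          # look direction
T0, T = steering_delays(positions, theta0)

# Net time shift applied to a look-direction source on each channel: tau_l - T_l.
net_shift = [tau - t for tau, t in zip(arrival_times(positions, theta0), T)]
```

Every entry of `net_shift` equals -T0, which is the statement that a look-direction source appears as s(t - T_0) on all channels after presteering.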

It follows from Figure 4.1 that the output y(t) of the processor is given by 

$$y(t) = \sum_{l=1}^{L} \sum_{k=1}^{J} x_l(t - (k-1)T)\, w_{lk} \tag{4.1.3}$$

where $w_{lk}$ denotes the weight on the kth tap of the lth channel. Note that the kth tap 
output corresponds to the output after (k - 1) delays. Thus, the first tap output corresponds 
to the output of the presteering delays before any tapped delay section, the second tap 
output corresponds to the output after one delay, and the Jth tap output corresponds to the 
output after J - 1 delays. 

Let W defined by 

$$W^T = [w_1^T, w_2^T, \ldots, w_J^T] \tag{4.1.4}$$

denote LJ weights of the filter structure, with $w_m$ denoting the column of L weights on 
the mth tap. 

Define an L-dimensional vector x(t) to denote array signals after presteering delays, that is, 

$$x(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{4.1.5}$$

and an LJ-dimensional vector X(t) to denote array signals across the TDL structure, that is, 

$$X^T(t) = [x^T(t), x^T(t-T), \ldots, x^T(t-(J-1)T)] \tag{4.1.6}$$

It follows from (4.1.3) to (4.1.6) that the output y(t) of the processor in the vector notation 
becomes 

$$y(t) = W^T X(t) \tag{4.1.7}$$
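
The equivalence of the double sum (4.1.3) and the stacked inner product (4.1.7) depends only on stacking weights and samples in the same tap-by-tap order. The sketch below illustrates this with made-up numbers for L = 2 channels and J = 3 taps; the nested-list layout (`[tap][channel]`) is a choice of this example.

```python
def output_double_sum(x_taps, w):
    """y = sum over taps k and channels l of x_l(t - (k-1)T) * w_lk, as in (4.1.3)."""
    return sum(x_taps[k][l] * w[k][l]
               for k in range(len(w)) for l in range(len(w[0])))

def stack(columns):
    """Stack per-tap columns into one long vector, tap 1 first, as in (4.1.4) and (4.1.6)."""
    return [value for column in columns for value in column]

def output_stacked(x_taps, w):
    """y = W^T X, as in (4.1.7)."""
    return sum(wi * xi for wi, xi in zip(stack(w), stack(x_taps)))

# x_taps[k][l]: signal on channel l after k delays; w[k][l]: weight on that tap (made-up values).
x_taps = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0]]
w = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

y_sum = output_double_sum(x_taps, w)
y_vec = output_stacked(x_taps, w)
```

Both expressions give 2.05 for these values.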


If X(t) can be modeled as a zero-mean stochastic process, then the mean output power 
of the processor for a given W is given by 

$$P(W) = E[y^2(t)] = W^T RW \tag{4.1.8}$$

where 

$$R = E[X(t)X^T(t)] \tag{4.1.9}$$

is an LJ x LJ dimensional real matrix and denotes the array correlation matrix with its 
elements representing the correlation between various tap outputs. The correlation 
between the outputs of mth tap on the lth channel and nth tap on the kth channel is given by 

$$(R_{m,n})_{l,k} = E[x_l(t - (m-1)T)\, x_k(t - (n-1)T)] \tag{4.1.10}$$

Note that the L x L matrix $R_{m,n}$ denotes the correlation between the array outputs at the 
mth and nth taps, that is, after (m - 1) and (n - 1) delays. 

Substituting from (4.1.2), it follows that 

$$(R_{m,n})_{l,k} = \rho[(m-n)T + T_l(\phi_0,\theta_0) - T_k(\phi_0,\theta_0) + \tau_k(\phi,\theta) - \tau_l(\phi,\theta)] \tag{4.1.11}$$

where $\rho(\tau)$ denotes the correlation function of s(t), that is, 

$$\rho(\tau) = E[s(t)s(t+\tau)] \tag{4.1.12}$$

The correlation function is related to the spectrum of the signal by the inverse Fourier 
transform, that is, 

$$\rho(\tau) = \int_{-\infty}^{\infty} S(f)\, e^{j2\pi f\tau}\, df \tag{4.1.13}$$

Thus, from known spectra of sources and their arrival directions, the correlation matrix 
may be calculated. In practice, it can also be estimated by measuring signals at the output 
of various taps. 

For M uncorrelated directional sources, the array correlation matrix is the sum of correlation 
matrices due to each source, that is, 

$$R = \sum_{l=1}^{M} R_l \tag{4.1.14}$$

where $R_l$ is the array correlation matrix due to the lth source in direction $(\phi_l, \theta_l)$. 

Let $R_S$ denote the array correlation matrix due to the signal source, that is, a source in 
the look direction, and $R_N$ denote the array correlation matrix due to noise, that is, 
unwanted directional sources and other noise. The mean output signal power $P_S(W)$ and 
mean output noise power $P_N(W)$ for a given weight vector are, respectively, given by 

$$P_S(W) = W^T R_S W \tag{4.1.15}$$

and 

$$P_N(W) = W^T R_N W \tag{4.1.16}$$

The output SNR for given weights is 

$$\mathrm{SNR}(W) = \frac{P_S(W)}{P_N(W)} = \frac{W^T R_S W}{W^T R_N W} \tag{4.1.17}$$

4.1.2 Frequency Response 

Assume that the signal induced on an element at the center of the coordinate system due 
to a monochromatic plane wave of frequency f can be represented in complex notation as 
$e^{j2\pi ft}$. Thus, the induced signal on the lth element after the steering delay due to a plane 
wave arriving in direction $(\phi,\theta)$ becomes $e^{j2\pi f(t + \tau_l(\phi,\theta) - T_l(\phi_0,\theta_0))}$. The frequency response $H(f,\phi,\theta)$ 
of the processor to a plane wave front arriving in direction $(\phi,\theta)$ is then given by 

$$\begin{aligned} H(f,\phi,\theta) &= \sum_{l=1}^{L} e^{j2\pi f\tau_l(\phi,\theta)}\, e^{-j2\pi fT_l(\phi_0,\theta_0)} \sum_{k=1}^{J} w_{lk}\, e^{-j2\pi f(k-1)T} \\ &= S^T(f,\phi,\theta)\, T(f) \sum_{k=1}^{J} w_k\, e^{-j2\pi f(k-1)T} \end{aligned} \tag{4.1.18}$$

where T(f) is a diagonal matrix of steering delays given by 

$$T(f) = \begin{bmatrix} e^{-j2\pi fT_1(\phi_0,\theta_0)} & & & 0 \\ & e^{-j2\pi fT_2(\phi_0,\theta_0)} & & \\ & & \ddots & \\ 0 & & & e^{-j2\pi fT_L(\phi_0,\theta_0)} \end{bmatrix} \tag{4.1.19}$$

and $S(f,\phi,\theta)$ is an L-dimensional vector defined as 

$$S^T(f,\phi,\theta) = [e^{j2\pi f\tau_1(\phi,\theta)}, \ldots, e^{j2\pi f\tau_L(\phi,\theta)}] \tag{4.1.20}$$

It follows from (4.1.1), (4.1.19), and (4.1.20) that 

$$S^T(f,\phi_0,\theta_0)\, T(f) = a(f)[1, 1, \ldots, 1] \tag{4.1.21}$$

where 

$$a(f) = e^{-j2\pi fT_0} \tag{4.1.22}$$

In this case, the frequency response of the array steered in the look direction $(\phi_0,\theta_0)$ is 
given by 

$$H(f,\phi_0,\theta_0) = a(f) \sum_{k=1}^{J} f_k\, e^{-j2\pi f(k-1)T} \tag{4.1.23}$$

where 

$$f_k = \mathbf{1}^T w_k, \quad k = 1, 2, \ldots, J \tag{4.1.24}$$

with $\mathbf{1}$ denoting a vector of ones. 

Let f be a J-dimensional constraint vector defined as 

$$f = [f_1, f_2, \ldots, f_J]^T \tag{4.1.25}$$

and C be an LJ x J constraint matrix defined as 

$$C = \begin{bmatrix} \mathbf{1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{1} & & \vdots \\ \vdots & & \ddots & \mathbf{0} \\ \mathbf{0} & \cdots & \mathbf{0} & \mathbf{1} \end{bmatrix} \tag{4.1.26}$$

where $\mathbf{1}$ and $\mathbf{0}$ are L-dimensional column vectors of ones and zeros, respectively. 
The J constraints defined by (4.1.24) can now be expressed as 

$$C^T W = f \tag{4.1.27}$$

Since a(f) given by (4.1.22) corresponds to a pure time delay, the J constraints $\{f_k\}$ can be 
used to specify the frequency response in the direction $(\phi_0,\theta_0)$. 
The processor can be forced to have a flat frequency response in the look direction by 
selecting f as follows: 

$$f_i = \begin{cases} 1 & i = k_0 \\ 0 & i \neq k_0 \end{cases} \tag{4.1.28}$$

where $k_0$ is a parameter, which can itself be optimized. Frequently, $k_0$ is taken as J/2 for 
J, an even number, and (J + 1)/2 for J, an odd number, since for a sufficiently large J this 
gives close to optimum performance. 
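
A short numerical sketch shows why the choice (4.1.28) gives a flat look-direction response: (4.1.23) then reduces to a single complex exponential, that is, a pure delay. The tap spacing, bulk delay, and test frequencies below are arbitrary illustrative values, and the helper names are not from the text.

```python
import cmath
import math

def look_direction_response(freq_hz, tap_sums, tap_spacing_s, bulk_delay_s):
    """H(f, phi_0, theta_0) = a(f) * sum_k f_k exp(-j 2 pi f (k-1) T), as in (4.1.23)."""
    a = cmath.exp(-2j * math.pi * freq_hz * bulk_delay_s)
    # enumerate() starts at 0, which matches the (k - 1) exponent for k = 1..J.
    return a * sum(fk * cmath.exp(-2j * math.pi * freq_hz * k * tap_spacing_s)
                   for k, fk in enumerate(tap_sums))

def flat_constraint(J):
    """Constraint vector of (4.1.28) with k_0 = J/2 (J even) or (J + 1)/2 (J odd)."""
    k0 = J // 2 if J % 2 == 0 else (J + 1) // 2
    return [1.0 if i == k0 else 0.0 for i in range(1, J + 1)]

f_vec = flat_constraint(6)
magnitudes = [abs(look_direction_response(freq, f_vec, 1.0e-3, 2.0e-3))
              for freq in (50.0, 120.0, 333.0, 499.0)]
```

Each magnitude is 1 regardless of frequency; only the phase changes, linearly with f, corresponding to a total delay of T_0 + (k_0 - 1)T.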


4.1.3 Optimization 

The frequency response of an array processor in the look direction can be fixed using the 
J constraints in (4.1.27). The processor can minimize the non-look direction noise when 
weights are selected by minimizing the total mean output power such that (4.1.27) is 
satisfied. Thus, in situations where one is interested in finding array weights, such that 
the array processor minimizes the total noise and has the specified response in the look 
direction, the following constrained beamforming problem is considered: 

$$\begin{aligned} & \underset{W}{\text{minimize}} && W^T RW \\ & \text{subject to} && C^T W = f \end{aligned} \tag{4.1.29}$$

where f is a J-dimensional vector that specifies the frequency response in the look direction 
and C is an LJ x J constraint matrix. 

Let $\hat{W}$ denote the solution of the above problem. The solution is obtained by the Lagrange 
multipliers method [Bry69, Lue69, Pie69]. This method transforms the constrained problem 
into an unconstrained problem by adding the constraint function to the cost function using 
a J-dimensional vector of undetermined Lagrange multipliers $\lambda$ to generate a new cost 
function. Let J(W) denote the cost function for the present problem. It is given by 

$$J(W) = \frac{1}{2} W^T RW + \lambda^T(C^T W - f) \tag{4.1.30}$$

where 1/2 is added to simplify the mathematics. 

Taking the gradient of (4.1.30) with respect to W, 

$$\nabla_W J(W) = RW + C\lambda \tag{4.1.31}$$

At the solution point, the cost function gradient is zero. Thus, 

$$R\hat{W} + C\lambda = 0 \tag{4.1.32}$$

Assuming that the inverse of the array correlation matrix R exists, $\hat{W}$ may be expressed 
in terms of Lagrange multipliers as 

$$\hat{W} = -R^{-1}C\lambda \tag{4.1.33}$$

Since $\hat{W}$ satisfies the constraint $C^T\hat{W} = f$, it follows from (4.1.33) that 

$$-C^T R^{-1} C\lambda = f \tag{4.1.34}$$

An expression for Lagrange multipliers may be found from (4.1.34), yielding 

$$\lambda = -(C^T R^{-1} C)^{-1} f \tag{4.1.35}$$

Substituting for Lagrange multipliers in (4.1.33) from (4.1.35), an expression for the optimal 
weights [Fro72] follows: 

$$\hat{W} = R^{-1}C(C^T R^{-1} C)^{-1} f \tag{4.1.36}$$

Let $\hat{P}$ denote the mean output power of the processor using optimal weights, that is, 

$$\hat{P} = \hat{W}^T R\hat{W} \tag{4.1.37}$$

Substituting for $\hat{W}$ from (4.1.36), 

$$\hat{P} = f^T(C^T R^{-1} C)^{-1} f \tag{4.1.38}$$

The point-constraint minimization problem (4.1.29) specifies J constraints on the weights 
such that the sum of L weights on all channels before the jth delay is equal to $f_j$. For all 
pass frequency responses in the look direction, all but one $f_i$, i = 1, 2, ..., J are selected to 
be equal to zero. For i close to J/2, $f_i$ is taken to be unity. Thus, the constraints specify 
that the sum of weights across the array is zero, except one near the middle of the filter 
that is equal to unity. 

Thus, for all pass frequency responses when $f_i$, i = 1, ..., J are selected as 

$$f_i = \begin{cases} 1 & i = k_0 \\ 0 & i \neq k_0 \end{cases} \tag{4.1.39}$$

equation (4.1.38) becomes 

$$\hat{P} = \left[(C^T R^{-1} C)^{-1}\right]_{k_0,k_0} \tag{4.1.40}$$

Application of broadband beamforming structures using TDL filters to mobile communications 
has been considered in [Win94, Des92, Ish95, Koh92] to overcome multipath 
fading and large delay spread in TDMA as well as in CDMA systems. 
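
The closed-form solution (4.1.36) and the output power (4.1.38) can be evaluated directly for a small case. The sketch below uses only the standard library, with a plain Gauss-Jordan solver written for this example; the correlation matrix `R_example` is an arbitrary symmetric, diagonally dominant matrix chosen for illustration, not data from the text.

```python
def solve(A, B):
    """Solve A X = B (B given as a matrix) by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + list(map(float, B[i])) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def constraint_matrix(L, J):
    """C of (4.1.26): column k has ones in the L rows that belong to tap k."""
    return [[1.0 if row // L == k else 0.0 for k in range(J)] for row in range(L * J)]

def optimal_weights(R, C, f):
    """Return (W_hat, P_hat) from (4.1.36) and (4.1.38)."""
    RinvC = solve(R, C)                         # R^{-1} C
    G = matmul(transpose(C), RinvC)             # C^T R^{-1} C
    g = solve(G, [[fi] for fi in f])            # (C^T R^{-1} C)^{-1} f
    W = [row[0] for row in matmul(RinvC, g)]
    power = sum(fi * gi[0] for fi, gi in zip(f, g))
    return W, power

L, J = 2, 2
C = constraint_matrix(L, J)
f = [1.0, 0.0]
R_example = [[2.0, 0.5, 0.2, 0.0],
             [0.5, 1.5, 0.0, 0.1],
             [0.2, 0.0, 1.8, 0.4],
             [0.0, 0.1, 0.4, 1.2]]
W_hat, P_hat = optimal_weights(R_example, C, f)

# With white noise only (R proportional to I) the solution reduces to C f / L.
identity = [[1.0 if i == j else 0.0 for j in range(L * J)] for i in range(L * J)]
W_white, _ = optimal_weights(identity, C, f)
```

For larger problems a numerical library would normally replace the hand-written solver; it is kept here so the example has no dependencies.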


4.1.4 Adaptive Algorithm 

A constrained LMS algorithm to estimate the optimal weights of a narrowband element 
space processor is discussed in Chapter 3. The corresponding algorithm to estimate the 
optimal weights of the broadband processor given by (4.1.36) may be developed as follows 
[Fro72]. 

Let W(n) denote the weights estimated at the nth iteration. At this stage, a new array 
sample X(n + 1) is available and the array output using weights W(n) is given by 

$$y(n) = W^T(n)X(n+1) \tag{4.1.41}$$

For notational simplicity it is assumed that the nth iteration coincides with the nth time 
sample. The new weight vector W(n + 1) is calculated by moving in the negative direction 
of the cost function gradient, that is, 

$$W(n+1) = W(n) - \mu\nabla_W J(W(n)) \tag{4.1.42}$$

where J(W(n)) is the cost function given by (4.1.30), with W replaced by W(n), and $\mu$ is a 
positive scalar. Replacing R with its noisy sample $X(n+1)X^T(n+1)$, it follows from (4.1.31) 
that 

$$\begin{aligned} \nabla_W J(W(n)) &= X(n+1)X^T(n+1)W(n) + C\lambda(n) \\ &= y(n)X(n+1) + C\lambda(n) \end{aligned} \tag{4.1.43}$$

where $\lambda(n)$ denotes the Lagrange multipliers at the nth iteration. 
Substituting from (4.1.43) in (4.1.42), 

$$W(n+1) = W(n) - \mu y(n)X(n+1) - \mu C\lambda(n) \tag{4.1.44}$$

Assuming that the estimated weights satisfy the constraints at each iteration, it follows 
from the second equation of (4.1.29) that 

$$C^T W(n+1) = C^T W(n) = f \tag{4.1.45}$$

Multiplying by $C^T$ on both sides of (4.1.44) and using (4.1.45), it follows that 

$$C^T y(n)X(n+1) + C^T C\lambda(n) = 0 \tag{4.1.46}$$

Solving for $\lambda(n)$, 

$$\lambda(n) = -(C^T C)^{-1} C^T y(n)X(n+1) \tag{4.1.47}$$

Substituting in (4.1.44), 

$$W(n+1) = W(n) - \mu y(n)PX(n+1) \tag{4.1.48}$$

where 

$$P = I - C(C^T C)^{-1} C^T \tag{4.1.49}$$

is a projection operator. It follows from (4.1.45) and (4.1.49) that 

$$PW(n) = W(n) - C(C^T C)^{-1} f \tag{4.1.50}$$

Thus, 

$$W(n) = PW(n) + C(C^T C)^{-1} f \tag{4.1.51}$$

and after substitution for W(n), (4.1.48) becomes 

$$W(n+1) = P[W(n) - \mu y(n)X(n+1)] + F \tag{4.1.52}$$

where 

$$F = C(C^T C)^{-1} f \tag{4.1.53}$$

Thus, knowing the array weights W(n), array output, and array sample X(n + 1), the 
new weights W(n + 1) can be calculated using the constrained LMS algorithm given by 
(4.1.52), (4.1.53), and (4.1.49). 

The algorithm is initialized at n = 0 using 

$$W(0) = F \tag{4.1.54}$$

The initialization of the algorithm using weights equal to F is selected because it denotes 
the optimal weights in the presence of only white noise, that is, no directional interference. 
This follows from the fact that the array correlation matrix R in this case is given by 

$$R = \sigma_n^2 I \tag{4.1.55}$$

Substituting in (4.1.36), it follows that 

$$\hat{W} = C(C^T C)^{-1} f = F \tag{4.1.56}$$

The convergence analysis of the algorithm may be carried out similar to that for the 
narrowband case discussed in Chapter 3. 

A substantial amount of computation in (4.1.52) is required to compute a multiplication 
between an LJ-dimensional vector and matrix P. The sparse nature of matrix C allows 
simplification of the algorithm with reduced computation as follows. 

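It follows from (4.1.26) that

$$C^T C = L I \tag{4.1.57}$$

where $I$ is an identity matrix.

Substituting in (4.1.53) and (4.1.49) yields

$$F = \frac{1}{L} C f = \frac{1}{L}\begin{bmatrix} f_1 \mathbf{1} \\ \vdots \\ f_J \mathbf{1} \end{bmatrix} \tag{4.1.58}$$

and

$$P = I - \frac{C C^T}{L} = I - \frac{1}{L}\begin{bmatrix} \mathbf{1}\mathbf{1}^T & & 0 \\ & \ddots & \\ 0 & & \mathbf{1}\mathbf{1}^T \end{bmatrix} \tag{4.1.59}$$

where $\mathbf{1}$ denotes an $L$-dimensional vector of ones.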


From (4.1.52), (4.1.58), and (4.1.59) an update equation in $W_j(n)$, $j = 0, 1, \ldots, J-1$ may be
expressed as [Buc86]

$$W_j(n+1) = \left[I - \frac{\mathbf{1}\mathbf{1}^T}{L}\right]\left[W_j(n) - \mu\, y(n)\, x(n+1-j)\right] + \frac{f_j}{L}\mathbf{1} \tag{4.1.60}$$

where $W_j(n)$ denotes the $L$ weights after the $j$th tap computed at the $n$th iteration, and $x(n+1-j)$ denotes the array signal after the $j$th tap. Thus, (4.1.60) allows iterative computation
of the $J$ columns of weights separately.

Noting that for an $L$-dimensional vector $a$,

$$\mathbf{1}^T a = \sum_{i=1}^{L} a_i \tag{4.1.61}$$

(4.1.60) may be implemented in summation form as [Fro72]:

$$(W_j(n+1))_l = (W_j(n))_l - \mu\, y(n)\, x_l(n+1-j) - \frac{1}{L}\sum_{i=1}^{L}\left[(W_j(n))_i - \mu\, y(n)\, x_i(n+1-j)\right] + \frac{f_j}{L} \tag{4.1.62}$$

where $(W_j(n))_l$ denotes the $l$th component of the weight vector $W_j(n)$, and $x_l(n+1-j)$ denotes the $l$th component of $x(n+1-j)$.
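
The decoupled update of (4.1.60) to (4.1.62) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `constrained_lms_step`, the step size, the array size, and the random input samples are assumptions made for the example and do not come from the text. The projection by $I - \mathbf{1}\mathbf{1}^T/L$ is implemented as a mean removal, as suggested by (4.1.61).

```python
import random

def constrained_lms_step(weights, taps, f, mu):
    """One decoupled constrained LMS update in the spirit of (4.1.60).

    weights: list of J columns, each a list of L weights W_j(n)
    taps:    list of J columns, each a list of L samples (array signal at each tap)
    f:       list of J constraint values f_j
    mu:      step size (an assumed small positive number)
    """
    L = len(weights[0])
    # Array output y(n): inner product of all weights with all tap samples.
    y = sum(w * x for W_j, x_j in zip(weights, taps) for w, x in zip(W_j, x_j))
    new_weights = []
    for W_j, x_j, f_j in zip(weights, taps, f):
        # Unconstrained gradient step for this column.
        stepped = [w - mu * y * x for w, x in zip(W_j, x_j)]
        # Projection (I - 11^T/L) removes the mean of the column, see (4.1.61).
        mean = sum(stepped) / L
        # Adding f_j / L restores the constraint 1^T W_j = f_j.
        new_weights.append([s - mean + f_j / L for s in stepped])
    return new_weights, y

# Illustrative run with made-up sizes and random samples (not real array data).
random.seed(0)
L, J = 4, 3
f = [1.0, 0.0, 0.0]
W = [[f_j / L] * L for f_j in f]          # initial weights W(0) = F, see (4.1.54)
for _ in range(50):
    X = [[random.gauss(0.0, 1.0) for _ in range(L)] for _ in range(J)]
    W, y = constrained_lms_step(W, X, f, mu=0.01)
column_sums = [sum(W_j) for W_j in W]     # each should stay equal to f_j
```

Because the mean removal and the $f_j/L$ offset are applied at every step, each column keeps satisfying its constraint regardless of the input data, which is the property the projection form is designed to guarantee.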

4.1.5 Minimum Mean Square Error Design 

The processor design considered in Section 4.1.3, obtained by solving the constrained optimization problem (4.1.29), minimizes the mean output power while maintaining a specified frequency response in the look direction. In this section, a processor design discussed in [Er85] is presented. This processor uses a TDL structure similar to that shown in Figure 4.1. The weights of the processor are estimated to minimize the MSE $e_0$ between the frequency response of the processor in the look direction and the desired look direction response over a frequency band of interest $[f_L, f_H]$, defined as

$$e_0 = \int_{f_L}^{f_H} \left|A(f,\phi_0,\theta_0) - H(f,\phi_0,\theta_0)\right|^2 df \tag{4.1.63}$$

where $A(f,\phi,\theta)$ denotes the desired frequency response in direction $(\phi,\theta)$. For a processor to have a flat frequency response in the look direction, it is given by

$$A(f,\phi_0,\theta_0) = \exp(j2\pi f\tau) \tag{4.1.64}$$

where $\tau$ denotes a delay parameter that may be optimized [Er85].

As the constraints on the weights are designed to minimize the deviation of the processor response from the desired response in the mean square sense, the presteering delays are not necessary. In this case, the presteering delays $T_l(\phi_0,\theta_0)$, $l = 1, 2, \ldots, L$ are set to zero. This is equivalent to the situation when matrix $T(f)$ is not included in the frequency response expression (4.1.18).

The processor also allows exact presteering as well as coarse presteering. For the exact 
presteering case, the steering delays are given by (4.1.1). This case is useful in comparing 
the performance of the processor using the minimum MSE design with that of the optimal 
processor discussed in Section 4.1.3. Coarse presteering arises when sampled signals are 
processed and steering delays are selected as the integer multiples of the sampling time 
closest to the exact delays required to steer the array in look direction. 
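
Coarse presteering as described above amounts to rounding each exact steering delay to the nearest multiple of the sampling time. A minimal sketch follows; the function name `coarse_presteering_delays` and all numerical values are hypothetical and chosen only for illustration.

```python
def coarse_presteering_delays(exact_delays, sampling_time):
    """Round each exact steering delay to the nearest multiple of the sampling time."""
    return [round(t / sampling_time) * sampling_time for t in exact_delays]

exact = [0.0, 0.37e-3, 0.74e-3, 1.11e-3]   # hypothetical exact delays in seconds
Ts = 0.25e-3                                # hypothetical sampling time in seconds
coarse = coarse_presteering_delays(exact, Ts)
# Residual steering error left on each channel after coarse presteering.
residual = [abs(c - e) for c, e in zip(coarse, exact)]
```

The residual on each channel is at most half a sampling interval, which is the steering error that the minimum MSE design then has to absorb.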

In the treatment that follows, it is assumed that the steering delays $T_l(\phi_0,\theta_0)$, $l = 1, 2, \ldots, L$
are included in the design and the frequency response of the processor is given by (4.1.18).
However, the values of $T_l(\phi_0,\theta_0)$ will depend on the case under consideration, that is, no
presteering, coarse presteering, or exact presteering.

4.1.5.1 Derivation of Constraints 

It follows from (4.1.63) that

$$e_0 = \sigma_0 + W^T Q W - 2 P^T W \tag{4.1.65}$$

where $\sigma_0$ is a scalar given by

$$\sigma_0 = \int_{f_L}^{f_H} A^*(f,\phi_0,\theta_0)\, A(f,\phi_0,\theta_0)\, df \tag{4.1.66}$$

$$W^T Q W = \int_{f_L}^{f_H} H^*(f,\phi_0,\theta_0)\, H(f,\phi_0,\theta_0)\, df \tag{4.1.67}$$

$$P^T W = \frac{1}{2}\int_{f_L}^{f_H} \left\{A^*(f,\phi_0,\theta_0)\, H(f,\phi_0,\theta_0) + H^*(f,\phi_0,\theta_0)\, A(f,\phi_0,\theta_0)\right\} df \tag{4.1.68}$$

$Q$ is an $LJ \times LJ$ positive semidefinite symmetric matrix, and $P$ is an $LJ$-dimensional vector.

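Substituting for $H(f,\phi_0,\theta_0)$ in (4.1.67) and (4.1.68) leads to the following expressions for
$Q$ and $P$ [Er85]:

$$Q_{k,l} = \psi\left[(\tau_i - \tau_j) + (T_j - T_i) + (n - m)T\right]$$

$$k = i + (m-1)L,\quad l = j + (n-1)L,\quad i, j = 1, 2, \ldots, L,\quad m, n = 1, 2, \ldots, J \tag{4.1.69}$$

where

$$\psi(x) = f_H\, \mathrm{sinc}(2\pi f_H x) - f_L\, \mathrm{sinc}(2\pi f_L x) \tag{4.1.70}$$

with

$$\mathrm{sinc}(a) = \frac{\sin a}{a} \tag{4.1.71}$$

and

$$P = \left[P_1^T, P_2^T, \ldots, P_J^T\right]^T \tag{4.1.72}$$

where

$$[P_k]_l = \frac{1}{2}\int_{f_L}^{f_H}\left\{A^*(f,\phi_0,\theta_0)\, e^{j2\pi f(\tau_l - T_l - (k-1)T)} + A(f,\phi_0,\theta_0)\, e^{-j2\pi f(\tau_l - T_l - (k-1)T)}\right\} df \tag{4.1.73}$$

$$l = 1, 2, \ldots, L,\quad k = 1, 2, \ldots, J$$

Here $\tau_i$ and $T_i$ are shorthand for $\tau_i(\phi_0,\theta_0)$ and $T_i(\phi_0,\theta_0)$.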
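
The entries of $Q$ in (4.1.69) depend only on the function $\psi$ of (4.1.70), which is the integral of $\cos(2\pi f x)$ over the band. The following sketch builds $Q$ for a small example and is meant only as an illustration under stated assumptions: the helper names (`sinc`, `psi`, `build_q`), the 0-based indexing, and all delay and band values are hypothetical.

```python
import math

def sinc(a):
    """sinc(a) = sin(a)/a as in (4.1.71), with the limit value 1 at a = 0."""
    return 1.0 if a == 0.0 else math.sin(a) / a

def psi(x, f_low, f_high):
    """psi(x) of (4.1.70): the integral of cos(2*pi*f*x) over [f_low, f_high]."""
    return (f_high * sinc(2 * math.pi * f_high * x)
            - f_low * sinc(2 * math.pi * f_low * x))

def build_q(tau, presteer, J, T, f_low, f_high):
    """Fill the LJ x LJ matrix Q of (4.1.69) using 0-based element and tap indices."""
    L = len(tau)
    Q = [[0.0] * (L * J) for _ in range(L * J)]
    for m in range(J):
        for i in range(L):
            for n in range(J):
                for j in range(L):
                    lag = (tau[i] - tau[j]) + (presteer[j] - presteer[i]) + (n - m) * T
                    Q[i + m * L][j + n * L] = psi(lag, f_low, f_high)
    return Q

# Hypothetical look-direction delays, no presteering, two taps.
tau = [0.0, 0.1, 0.2]
presteer = [0.0, 0.0, 0.0]
f_low, f_high = 0.1, 0.4
Q = build_q(tau, presteer, J=2, T=0.5, f_low=f_low, f_high=f_high)

# Cross-check psi against a midpoint-rule integration of cos(2*pi*f*x).
x_test, steps = 0.7, 20000
df = (f_high - f_low) / steps
numeric = sum(math.cos(2 * math.pi * (f_low + (s + 0.5) * df) * x_test) * df
              for s in range(steps))
```

Since $\psi$ is an even function, swapping the roles of $(i, m)$ and $(j, n)$ leaves the entry unchanged, so the matrix built this way is symmetric, and every diagonal entry equals $f_H - f_L$.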

Let $\tilde{W}$ denote an $LJ$-dimensional vector that minimizes $e_0$. Thus,

$$\left.\frac{\partial e_0}{\partial W}\right|_{W=\tilde{W}} = 0 \tag{4.1.74}$$

It follows from (4.1.65) and (4.1.74) that $\tilde{W}$ satisfies

$$Q \tilde{W} = P \tag{4.1.75}$$

Rewrite (4.1.65) using (4.1.75) as

$$e_0 = (W - \tilde{W})^T Q (W - \tilde{W}) - \tilde{W}^T Q \tilde{W} + \sigma_0 \tag{4.1.76}$$

As the signal distortion depends on the allowed MSE between the desired look direction
response and the processor response in the look direction over the frequency band of
interest, the processor weights can be constrained to keep the MSE less than or equal to
some threshold value $\delta_0$. Thus, an optimization problem can be formulated as discussed
below.

4.1.5.2 Optimization 

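Consider the following optimization problem:

$$\begin{aligned} &\underset{W}{\text{minimize}} && W^T R W \\ &\text{subject to} && e_0 \le \delta_0 \end{aligned} \tag{4.1.77}$$

Defining an error vector

$$V = \tilde{W} - W \tag{4.1.78}$$

where $\tilde{W}$ satisfies (4.1.75), and using (4.1.76), the optimization problem (4.1.77) becomes

$$\begin{aligned} &\underset{V}{\text{minimize}} && (\tilde{W} - V)^T R (\tilde{W} - V) \\ &\text{subject to} && V^T Q V \le \eta \end{aligned} \tag{4.1.79}$$

where

$$\eta = \tilde{W}^T Q \tilde{W} + \delta_0 - \sigma_0 \tag{4.1.80}$$

Note that (4.1.80) follows from (4.1.76), (4.1.77), and (4.1.79).

Let $W_E$ be the solution of the optimization problem (4.1.77). It can be obtained using the
Lagrange multipliers method as follows [Er85].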

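Let $J(V,\lambda)$ be the cost function defined as

$$J(V,\lambda) = (\tilde{W} - V)^T R (\tilde{W} - V) + \lambda\left(V^T Q V - \eta\right) \tag{4.1.81}$$

where $\lambda > 0$ is the Lagrange multiplier and $\tilde{W}$ satisfies (4.1.75). As $J(V,\lambda)$ is a convex function of $V$, the solution $V(\lambda)$ for any $\lambda$ is given by

$$\left.\frac{\partial J(V,\lambda)}{\partial V}\right|_{V=V(\lambda)} = 0 \tag{4.1.82}$$

Substituting from (4.1.81) it follows that

$$(R + \lambda Q)\, V(\lambda) = R \tilde{W} \tag{4.1.83}$$

which implies

$$V^T(\lambda) R V(\lambda) + \lambda V^T(\lambda) Q V(\lambda) = V^T(\lambda) R \tilde{W} \tag{4.1.84}$$

Substituting $V = V(\lambda)$ in (4.1.81) and rewriting it as

$$J(V(\lambda),\lambda) = \tilde{W}^T R \tilde{W} - \tilde{W}^T R V(\lambda) - V^T(\lambda) R \tilde{W} + V^T(\lambda) R V(\lambda) + \lambda V^T(\lambda) Q V(\lambda) - \lambda\eta \tag{4.1.85}$$

and using (4.1.84), (4.1.85) becomes

$$J(V(\lambda),\lambda) = \tilde{W}^T R \tilde{W} - \tilde{W}^T R V(\lambda) - \lambda\eta \tag{4.1.86}$$

It follows from (4.1.83) that

$$V(\lambda) = (R + \lambda Q)^{-1} R \tilde{W} \tag{4.1.87}$$

Substituting for $V(\lambda)$ from (4.1.87) in (4.1.86),

$$J(\lambda) \triangleq J(V(\lambda),\lambda) = \tilde{W}^T R \tilde{W} - \tilde{W}^T R (R + \lambda Q)^{-1} R \tilde{W} - \lambda\eta \tag{4.1.88}$$

It follows from the duality theorem [Lue69] that the optimum Lagrange multiplier $\hat{\lambda}$
can be obtained by maximizing $J(\lambda)$. Thus, it follows that

$$\left.\frac{\partial J(\lambda)}{\partial \lambda}\right|_{\lambda=\hat{\lambda}} = 0 \tag{4.1.89}$$

Substituting (4.1.88) in (4.1.89) yields

$$-\tilde{W}^T R \left.\frac{\partial}{\partial\lambda}(R + \lambda Q)^{-1}\right|_{\lambda=\hat{\lambda}} R \tilde{W} = \eta \tag{4.1.90}$$

To carry out the partial differentiation of $(R + \lambda Q)^{-1}$, define an invertible matrix:

$$A(\lambda) = R + \lambda Q \tag{4.1.91}$$

Thus,

$$A(\lambda) A^{-1}(\lambda) = I \tag{4.1.92}$$

Carrying out the partial differentiation with respect to $\lambda$ results in

$$\frac{\partial A(\lambda)}{\partial\lambda} A^{-1}(\lambda) + A(\lambda)\frac{\partial A^{-1}(\lambda)}{\partial\lambda} = 0 \tag{4.1.93}$$

Hence,

$$\frac{\partial A^{-1}(\lambda)}{\partial\lambda} = -A^{-1}(\lambda)\frac{\partial A(\lambda)}{\partial\lambda} A^{-1}(\lambda) \tag{4.1.94}$$

Substituting for $A(\lambda)$ yields

$$\frac{\partial}{\partial\lambda}(R + \lambda Q)^{-1} = -(R + \lambda Q)^{-1} Q (R + \lambda Q)^{-1} \tag{4.1.95}$$

(4.1.90) and (4.1.95) imply that $\hat{\lambda}$ is the solution of

$$\tilde{W}^T R (R + \hat{\lambda} Q)^{-1} Q (R + \hat{\lambda} Q)^{-1} R \tilde{W} = \eta \tag{4.1.96}$$

(4.1.87) and (4.1.78) imply that $W_E$, the solution of (4.1.77), is given by

$$W_E = \tilde{W} - (R + \hat{\lambda} Q)^{-1} R \tilde{W} \tag{4.1.97}$$

where $\tilde{W}$ satisfies (4.1.75).

See [Er85] for discussion of the processor when it has exact presteering and is designed
for flat response over the entire frequency range $(0, 1/2T)$. In this case, processor performance
approaches that of the TDL processor discussed in Section 4.1.3 as $\delta_0 \to 0$.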
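
Equation (4.1.96) is a scalar equation in the Lagrange multiplier and generally has to be solved numerically. The sketch below illustrates one way to do this by bisection in the special case where $R$ and $Q$ are both diagonal, so that every matrix expression reduces to a sum over components. The diagonal assumption, the function name `solve_multiplier`, and all numbers are hypothetical simplifications for illustration, not the procedure of [Er85]. It also assumes the constraint is active, that is, the right-hand side of (4.1.96) is positive and smaller than the left-hand side evaluated at a zero multiplier.

```python
def solve_multiplier(r, q, w_tilde, eta, lam_hi=1.0, tol=1e-12):
    """Bisection for the multiplier in (4.1.96), assuming diagonal R and Q.

    r, q:     diagonal entries of R (positive) and Q (non-negative)
    w_tilde:  entries of the minimum MSE weight vector of (4.1.75)
    eta:      right-hand side of (4.1.96), assumed to satisfy 0 < eta < g(0)
    """
    def g(lam):
        # Left-hand side of (4.1.96); in the diagonal case it is a decreasing
        # function of lam and equals V(lam)^T Q V(lam).
        return sum(qi * (ri * wi / (ri + lam * qi)) ** 2
                   for ri, qi, wi in zip(r, q, w_tilde))

    lam_lo = 0.0
    while g(lam_hi) > eta:          # expand until the root is bracketed
        lam_hi *= 2.0
    while lam_hi - lam_lo > tol * max(1.0, lam_hi):
        mid = 0.5 * (lam_lo + lam_hi)
        if g(mid) > eta:
            lam_lo = mid
        else:
            lam_hi = mid
    return 0.5 * (lam_lo + lam_hi)

# Hypothetical diagonal example.
r = [2.0, 1.0, 0.5]
q = [1.0, 0.5, 0.25]
w_tilde = [1.0, -0.5, 0.25]
eta = 0.2
lam = solve_multiplier(r, q, w_tilde, eta)
v = [ri * wi / (ri + lam * qi) for ri, qi, wi in zip(r, q, w_tilde)]   # (4.1.87)
w_e = [wi - vi for wi, vi in zip(w_tilde, v)]                           # (4.1.97)
constraint_value = sum(qi * vi * vi for qi, vi in zip(q, v))            # V^T Q V
```

For full (non-diagonal) matrices the same bisection idea applies, with the sums replaced by the matrix products of (4.1.96) evaluated by a linear algebra library.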


4.2 Partitioned Realization 

The broadband processor structure shown in Figure 4.1 is sometimes referred to as an
element space processor or direct form of realization, as opposed to a beam space processor
or partitioned form of realization. In the partitioned form, the processor is generally
realized using two blocks as shown in Figure 4.2. The upper block forms a fixed main
beam to receive the signal from the look direction, and the lower block forms auxiliary
beams, also known as secondary beams, to estimate the noise (interference and other
unwanted noise) in the main beam. The lower block is designed to have no look direction
signal so that when its output is subtracted from the main beam it reduces the noise. The
blocking of the signal from the lower section may be achieved in several ways.

FIGURE 4.2
Broadband processor structure with partitioned realization.

In one case, the array signals are processed through a signal blocking filter before
processing. Signal processing in this case solves an unconstrained optimization problem.
This unconstrained partitioned processor is referred to as the generalized side-lobe canceler
and is shown in Figure 4.3.

FIGURE 4.3
Broadband processor structure with unconstrained partitioned realization.

The signal in the lower section may also be blocked using constraints on its weights. In
this case, the weights of the lower section are estimated by solving a constrained optimization
problem. This form of realization is referred to as the constrained partitioned realization. Its
block diagram is shown in Figure 4.4. Both forms of realization are discussed in this section.

FIGURE 4.4
Broadband processor structure with constrained partitioned realization.


4.2.1 Generalized Side-Lobe Canceler 

The structure shown in Figure 4.3, also referred to as the generalized side-lobe canceler
for broadband signals [Gri82], is discussed here for a point constraint, that is, the response
is constrained to be unity in the look direction. Steering delays are used to align the
waveform arriving from the look direction as discussed in the previous section for the element
space processor. The array signals after the steering delays are passed through two sections.
The upper section is designed to produce a fixed beam with a specified frequency
response and the lower section consists of adjustable weights. The output of the lower
section is subtracted from the output of the fixed beam to produce the processor output.

The upper section consists of a broadband conventional beam with a required frequency
response obtained by selecting the coefficients $f_j$, $j = 1, \ldots, J$ of the FIR filter. Signals from
all channels are equally weighted and summed to produce the output $y_c(t)$ of the conventional
beam. For this realization to be equivalent to the direct form of realization, all
weights need to be equal to $1/L$ and the filter coefficients $f_j$, $j = 1, \ldots, J$ need to be specified
as discussed in the previous section. The output of the fixed beam is given by

$$y_F(t) = \sum_{k=0}^{J-1} f_{k+1}\, y_c(t - kT) \tag{4.2.1}$$

with

$$y_c(t) = \frac{x^T(t)\,\mathbf{1}}{L} \tag{4.2.2}$$

where $x(t)$ denotes the array signal after presteering delays.

The fixed beam output can be expressed using the vector notation as

$$y_F(t) = W_F^T X(t) \tag{4.2.3}$$

where $X(t)$ is an $LJ$-dimensional array signal vector defined by (4.1.6), and $W_F$ is an $LJ$-dimensional
fixed weight vector given by

$$W_F = C (C^T C)^{-1} f \tag{4.2.4}$$

and $C$ is the constraint matrix given by (4.1.26). Note that $W_F$ is identical to $F$ defined by
(4.1.53).

The lower section consists of a matrix prefilter and a TDL structure. The matrix prefilter
shown in the lower section is designed to block the signal arriving from the look direction.
Since the signal waveforms on all channels after the steering delays are alike, the signal blocking can
be achieved by selecting the matrix $B$ such that the sum of each of its rows is equal to zero.
For the partitioned processor to have the same degrees of freedom as the direct
form, the $L - 1$ rows of the matrix $B$ need to be linearly independent. The output $e(t)$ after
the matrix prefilter is an $(L-1)$-dimensional vector given by

$$e(t) = B x(t) \tag{4.2.5}$$

and can be thought of as the outputs of $L - 1$ beams that are then shaped by the coefficients
of the FIR filter of each TDL section. Let an $(L-1)$-dimensional vector $v_k$ denote these
coefficients before the $k$th delay. The $J$ vectors $v_1, \ldots, v_J$ correspond to the $J$ columns of
weights in the tapped delay line filter in the lower section. The lower filter output is then
given by

$$y_A(t) = \sum_{k=0}^{J-1} v_{k+1}^T\, e(t - kT) \tag{4.2.6}$$

The output may be expressed in the vector notation as

$$y_A(t) = V^T E(t) \tag{4.2.7}$$

where the $(L-1)J$-dimensional vector $V$ denotes the weights of the lower section, defined as

$$V = \left[v_1^T, v_2^T, \ldots, v_J^T\right]^T \tag{4.2.8}$$

and the $(L-1)J$-dimensional vector $E(t)$ denotes the array signals in the lower section, defined
as

$$E^T(t) = \left[e^T(t), e^T(t - T), \ldots, e^T(t - (J-1)T)\right] \tag{4.2.9}$$


© 2004 by CRC Press LLC 



It follows from (4.2.3) and (4.2.7) that the array output is then given by

$$y(t) = y_F(t) - y_A(t) = W_F^T X(t) - V^T E(t) \tag{4.2.10}$$

For a given weight vector $V$, the mean output power of the processor is given by

$$\begin{aligned} P(V) &= E[y^2(t)] \\ &= E\left[\{W_F^T X(t) - V^T E(t)\}^2\right] \\ &= W_F^T R W_F - W_F^T R_{XE} V - V^T R_{XE}^T W_F + V^T R_{EE} V \end{aligned} \tag{4.2.11}$$

where

$$R_{XE} = E[X(t) E^T(t)] \tag{4.2.12}$$

and

$$R_{EE} = E[E(t) E^T(t)] \tag{4.2.13}$$

As the array signal vectors $E(t)$ and $X(t)$ are related through matrix $B$, both matrices $R_{XE}$
and $R_{EE}$ can be rewritten in terms of $R$ and $B$.

Since the response of the processor in the look direction is fixed due to the fixed beam,
and the lower section contains no signal from the look direction due to the presence of
the matrix prefilter, nonlook direction noise may be minimized by adjusting the weights of
the lower section to minimize the mean output power. Thus, the optimal weights, denoted
by $\hat{V}$, are the solution of the following unconstrained beamforming problem:

$$\underset{V}{\text{minimize}} \quad P(V) \tag{4.2.14}$$

Since the mean output power surface $P(V)$ is a quadratic function of $V$, the solution of the
above problem can be obtained by taking the gradient of $P(V)$ with respect to $V$
and setting it equal to zero. Thus,

$$\left.\nabla_V P(V)\right|_{V=\hat{V}} = 0 \tag{4.2.15}$$

Substituting for $P(V)$ from (4.2.11),

$$R_{EE} \hat{V} = R_{XE}^T W_F \tag{4.2.16}$$

When the array correlation matrix $R$ is invertible, the matrix $R_{EE}$ is invertible and (4.2.16)
yields

$$\hat{V} = R_{EE}^{-1} R_{XE}^T W_F \tag{4.2.17}$$

It can be shown [Gri82] that when the weights in the array processors in Figure 4.1 and
Figure 4.2 are optimized, the performance of the two processors is identical. The weights $\hat{V}$
may be expressed using the array correlation matrix as follows.

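Let $\tilde{B}$ be an $(L-1)J \times LJ$ block-diagonal matrix defined as

$$\tilde{B} = \begin{bmatrix} B & & 0 \\ & \ddots & \\ 0 & & B \end{bmatrix} \tag{4.2.18}$$

It follows from (4.2.5), (4.2.9), and (4.2.18) that

$$E(t) = \begin{bmatrix} e(t) \\ e(t - T) \\ \vdots \\ e(t - (J-1)T) \end{bmatrix} = \begin{bmatrix} B x(t) \\ B x(t - T) \\ \vdots \\ B x(t - (J-1)T) \end{bmatrix} = \tilde{B} X(t) \tag{4.2.19}$$

Substituting in (4.2.12) and (4.2.13) yields

$$R_{XE} = E[X(t) X^T(t)]\, \tilde{B}^T = R \tilde{B}^T \tag{4.2.20}$$

and

$$R_{EE} = \tilde{B} R \tilde{B}^T \tag{4.2.21}$$

It follows from (4.2.17), (4.2.20), and (4.2.21) that

$$\hat{V} = (\tilde{B} R \tilde{B}^T)^{-1} \tilde{B} R W_F \tag{4.2.22}$$

Substituting in (4.2.10) from (4.2.19) and (4.2.22), the output of the processor with optimized
weights becomes

$$y(t) = W_F^T X(t) - W_F^T R \tilde{B}^T (\tilde{B} R \tilde{B}^T)^{-1} \tilde{B} X(t) = W_F^T \left[I - R \tilde{B}^T (\tilde{B} R \tilde{B}^T)^{-1} \tilde{B}\right] X(t) \tag{4.2.23}$$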
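
The signal blocking property of the matrix prefilter can be checked with a small example. The sketch below uses adjacent-channel differences as the rows of the blocking matrix, which is one common choice that satisfies the zero row sum and linear independence requirements; this particular choice, the function names, and the sample values are assumptions made for illustration and are not prescribed by the text.

```python
def blocking_matrix(L):
    """Return an (L-1) x L matrix whose rows each sum to zero.

    Row i holds +1 in column i and -1 in column i+1, i.e. the difference of
    adjacent channels. The rows are linearly independent by construction.
    """
    B = [[0.0] * L for _ in range(L - 1)]
    for i in range(L - 1):
        B[i][i] = 1.0
        B[i][i + 1] = -1.0
    return B

def prefilter(B, x):
    """Compute e = B x for one snapshot x of the presteered array signal."""
    return [sum(b * xi for b, xi in zip(row, x)) for row in B]

L = 4
B = blocking_matrix(L)
look_sample = [0.8] * L              # look-direction signal: identical on all channels
off_axis = [0.8, 0.1, -0.5, 0.3]     # hypothetical snapshot from another direction
e_look = prefilter(B, look_sample)   # expected to vanish
e_off = prefilter(B, off_axis)       # generally non-zero
row_sums = [sum(row) for row in B]
```

After presteering, the look-direction waveform is the same on every channel, so any matrix with zero row sums removes it, while signals from other directions pass through to the adaptive weights.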
4.2.2 Constrained Partitioned Realization

Figure 4.4 shows the structure of the constrained partitioned processor [Jim77]. The
main difference between the constrained processor and the unconstrained processor
(referred to as the generalized side-lobe canceler in the previous section) is that the latter
uses a signal blocking matrix to stop the signal from entering the lower section and solves
an unconstrained beamforming problem, whereas the constrained processor uses constraints
on the weights of the lower section to eliminate the signal at the output of the
lower section. Consequently, the optimization problem solved to estimate the weights of
the lower section is a constrained one.

Let the $LJ$-dimensional vector $W_F$ shown in Figure 4.4 denote the weights of the fixed
beam (upper section). Thus, the output of the upper section $y_F(t)$ is given by

$$y_F(t) = W_F^T X(t) \tag{4.2.24}$$

Let the $LJ$-dimensional vector $W$ denote the weights of the lower section. Thus, the
output of the lower section $y_A(t)$ is given by

$$y_A(t) = W^T X(t) \tag{4.2.25}$$

The processor output $y(t)$ is the difference of the two outputs. Thus,

$$y(t) = W_F^T X(t) - W^T X(t) = (W_F - W)^T X(t) \tag{4.2.26}$$

The mean output power $P(W)$ for given weights is given by

$$P(W) = (W_F - W)^T R (W_F - W) \tag{4.2.27}$$

The lower section is designed such that its output does not contain the look direction
signal. This is achieved by selecting its weights to be the solution of the following beamforming
problem:

$$\begin{aligned} &\underset{W}{\text{minimize}} && (W_F - W)^T R (W_F - W) \\ &\text{subject to} && C^T W = 0 \end{aligned} \tag{4.2.28}$$

It follows from (4.1.26) and the second equation of (4.2.28) that

$$\mathbf{1}^T W_j = 0, \quad j = 1, 2, \ldots, J \tag{4.2.29}$$

where $W_j$ denotes the weights of the $j$th column, that is, before the $j$th delay in the lower
section.

Since the look direction signal waveforms on all elements after presteering delays are
alike, the constraints of (4.2.29) ensure that the lower section has a null response in the
look direction. Thus, the constraint in (4.2.28) achieves a null in the look direction similar
to that achieved by the matrix prefilter $B$ discussed in the previous section.

Let $W_0$ denote the solution of (4.2.28). Using the method of Lagrange multipliers discussed
in Section 4.1,

$$W_0 = W_F - R^{-1} C (C^T R^{-1} C)^{-1} C^T W_F = W_F - \hat{W} \tag{4.2.30}$$

where $\hat{W}$ is given by (4.1.36).

4.2.3 General Constrained Partitioned Realization 

In this section, a processor realization in general constrained form is presented where the
upper section is designed to minimize the MSE between the desired look direction
response and the frequency response of the processor in the look direction over a frequency
band of interest $[f_L, f_H]$, as discussed in Section 4.1.5. The lower section is designed such
that its weights are constrained to yield a zero power response in the look direction over the frequency band
of interest to prevent signal suppression. Design details may be found in [Er86].

Let an $LJ$-dimensional vector $\tilde{W}$ denote the weights of the upper section. These are
designed using the minimum MSE design and satisfy (4.1.75). The output of the upper section
$y_F(t)$ is given by

$$y_F(t) = \tilde{W}^T X(t) \tag{4.2.31}$$

Let an $LJ$-dimensional vector $W$ denote the weights of the lower section. Thus, the output
of the lower section $y_A(t)$ is given by

$$y_A(t) = W^T X(t) \tag{4.2.32}$$

and the processor output $y(t)$ is given by

$$y(t) = (\tilde{W} - W)^T X(t) \tag{4.2.33}$$

The mean output power $P(W)$ for a given $W$ is given by

$$P(W) = (\tilde{W} - W)^T R (\tilde{W} - W) \tag{4.2.34}$$


4.2.3.1 Derivation of Constraints 

Let the weight vector $W$ be constrained such that the power response of the lower section in
the look direction is zero over the frequency band of interest, that is,

$$\int_{f_L}^{f_H} H^*(f,\phi_0,\theta_0)\, H(f,\phi_0,\theta_0)\, df = 0 \tag{4.2.35}$$

It follows from (4.2.35) and (4.1.67) that

$$W^T Q W = 0 \tag{4.2.36}$$

As $Q$ is a positive semidefinite matrix, it can be factorized using its eigenvalues and
eigenvectors as

$$Q = U \Lambda U^T \tag{4.2.37}$$

where $\Lambda$ is a diagonal matrix with its elements being $\lambda_i(Q)$, $i = 1, 2, \ldots, LJ$, the eigenvalues
of $Q$, such that

$$\lambda_1(Q) \ge \lambda_2(Q) \ge \cdots \ge \lambda_{LJ}(Q) \ge 0 \tag{4.2.38}$$

and $U$ is an $LJ \times LJ$ matrix of the eigenvectors of $Q$, that is,

$$U = [U_1, U_2, \ldots, U_{LJ}] \tag{4.2.39}$$

where $U_i$, $i = 1, 2, \ldots, LJ$ are the orthonormal eigenvectors of $Q$ with the property that

$$U_i^T U_j = \begin{cases} 0 & i \ne j \\ 1 & i = j \end{cases} \tag{4.2.40}$$

Substituting (4.2.37) in (4.2.36),

$$W^T U \Lambda U^T W = 0 \tag{4.2.41}$$

Assume that $Q$ has rank $\eta_0$. Thus, it follows from (4.2.39) and (4.2.41) that the necessary
and sufficient conditions to satisfy (4.2.41) are

$$W^T U_i = 0, \quad i = 1, 2, \ldots, \eta_0 \tag{4.2.42}$$

Thus, linear constraints of the form (4.2.42) can be used to ensure that the lower section
has a zero power response in the look direction over the frequency range of interest. It
should be noted that signal blocking in the lower section using these constraints is independent
of presteering delays, that is, the processor may include exact presteering, coarse
presteering, or no presteering.
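
The step from (4.2.41) to (4.2.42) can be spelled out as a short worked expansion. It uses only the orthonormality of the eigenvectors and the ordering of the eigenvalues stated above, written here with $U_i$ for the eigenvectors, $\lambda_i(Q)$ for the eigenvalues, and $\eta_0$ for the rank of $Q$:

$$W^T Q W = W^T U \Lambda U^T W = \sum_{i=1}^{LJ} \lambda_i(Q)\,(U_i^T W)^2 = \sum_{i=1}^{\eta_0} \lambda_i(Q)\,(U_i^T W)^2$$

The last equality holds because the eigenvalues beyond the rank are zero. Since $\lambda_i(Q) > 0$ for $i \le \eta_0$ and every term in the sum is non-negative, the sum is zero if and only if $U_i^T W = 0$ for each $i = 1, 2, \ldots, \eta_0$, which is the set of linear constraints in (4.2.42).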

4.2.3.2 Optimization 

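Let the optimum weight vector $\hat{W}$ be the solution of the following constrained beamforming
problem, in which $\tilde{W}$ denotes the upper section weights satisfying (4.1.75):

$$\begin{aligned} &\underset{W}{\text{minimize}} && (\tilde{W} - W)^T R (\tilde{W} - W) \\ &\text{subject to} && U_{\eta_0}^T W = 0 \end{aligned} \tag{4.2.43}$$

where $U_{\eta_0}$ is the $LJ \times \eta_0$ dimensional matrix given by

$$U_{\eta_0} = [U_1, U_2, \ldots, U_{\eta_0}] \tag{4.2.44}$$

As $U_i$, $i = 1, 2, \ldots, \eta_0$ are linearly independent, the matrix $U_{\eta_0}$ has full rank. Using the
method of Lagrange multipliers,

$$\hat{W} = \tilde{W} - R^{-1} U_{\eta_0} \left(U_{\eta_0}^T R^{-1} U_{\eta_0}\right)^{-1} U_{\eta_0}^T \tilde{W} \tag{4.2.45}$$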


4.3 Derivative Constrained Processor 

The implication of the point constraint considered in Section 4.1 is that the array pattern
has a unity response in the look direction. The beam can be broadened using additional constraints,
such as derivative constraints, along with the point constraint [Er83, Er90, Er86a, Thn93].
The derivative constraints set the derivatives of the power pattern with respect to $\phi$ and
$\theta$ equal to zero in the look direction. The higher the order of the derivatives, that is, first order,
second order, and so on, the broader the beam in the look direction normally becomes. A
broader beam is useful when the actual signal direction and the known direction of the signal
are not precisely the same. In such situations, the processor with the point constraint in
the known direction of the signal would cancel the desired signal as if it were interference.
Other directional constraints to improve the performance of the beamformer in the presence
of look direction error include multiple linear constraints [Tak85, Buc87, Gri87]
and inequality constraints [Ahm83, Ahm84, Er90a, Er93].

In this section, some of these constraints are derived, a beamforming problem using
these constraints is formulated, an algorithm to estimate the solution of the optimization
problem is presented, and the effect that the choice of coordinate system origin has on the
performance of an array system using derivative constraints is discussed.

Derivative constraints are derived by setting derivatives of the power response $p(f,\phi,\theta)$
with respect to $\phi$ and $\theta$ to zero in direction $(\phi_0,\theta_0)$. Since $H(f,\phi,\theta)$ denotes the frequency
response of the processor, it follows that

$$p(f,\phi,\theta) = H^*(f,\phi,\theta)\, H(f,\phi,\theta) \tag{4.3.1}$$

The first-order derivative constraints are now derived [Er83].


4.3.1 First-Order Derivative Constraints 

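It follows from (4.3.1) that the partial derivative of the power response with respect to $\phi$
is given by

$$\frac{\partial p}{\partial\phi} = H^* \frac{\partial H}{\partial\phi} + \frac{\partial H^*}{\partial\phi} H \tag{4.3.2}$$

where the parameters of $p(f,\phi,\theta)$ and $H(f,\phi,\theta)$ are omitted for ease of notation. It follows
from (4.1.18) that

$$\frac{\partial H}{\partial\phi} = \frac{\partial S^T(f,\phi,\theta)}{\partial\phi}\, T(f) \sum_{l=1}^{J} W_l\, e^{-j2\pi f(l-1)T} \tag{4.3.3}$$

Differentiating (4.1.20) with respect to $\phi$,

$$\frac{\partial S^T(f,\phi,\theta)}{\partial\phi} = j2\pi f\left[\frac{\partial\tau_1(\phi,\theta)}{\partial\phi}\, e^{j2\pi f\tau_1(\phi,\theta)}, \ldots, \frac{\partial\tau_L(\phi,\theta)}{\partial\phi}\, e^{j2\pi f\tau_L(\phi,\theta)}\right] = j2\pi f\, S^T(f,\phi,\theta)\, \Lambda_\phi(\phi,\theta) \tag{4.3.4}$$

where

$$\Lambda_\phi(\phi,\theta) = \begin{bmatrix} \dfrac{\partial\tau_1(\phi,\theta)}{\partial\phi} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\partial\tau_L(\phi,\theta)}{\partial\phi} \end{bmatrix} \tag{4.3.5}$$

and $\tau_l(\phi,\theta)$ is given by (2.1.1). It can also be expressed as

$$\tau_l(\phi,\theta) = \frac{1}{c}\left\{(x_l\cos\phi + y_l\sin\phi)\sin\theta + z_l\cos\theta\right\} \tag{4.3.6}$$

where $x_l$, $y_l$, and $z_l$ denote the components of the position of the $l$th element along the $x$, $y$, and $z$ axes,
respectively, and $c$ denotes the speed of propagation.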

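Substituting (4.3.4) in (4.3.3) and noting that $T(f)$ and $\Lambda_\phi(\phi,\theta)$ are diagonal matrices,

$$\frac{\partial H}{\partial\phi} = j2\pi f\, S^T(f,\phi,\theta)\, T(f)\, \Lambda_\phi(\phi,\theta) \sum_{l=1}^{J} W_l\, e^{-j2\pi f(l-1)T} \tag{4.3.7}$$

which implies

$$\left.\frac{\partial H}{\partial\phi}\right|_{(\phi_0,\theta_0)} = j2\pi f\, S^T(f,\phi_0,\theta_0)\, T(f)\, \Lambda_\phi(\phi_0,\theta_0) \sum_{l=1}^{J} W_l\, e^{-j2\pi f(l-1)T} \tag{4.3.8}$$

Noting from (4.1.21) that

$$S^T(f,\phi_0,\theta_0)\, T(f) = a(f)\, \mathbf{1}^T \tag{4.3.9}$$

(4.3.7) yields

$$\left.\frac{\partial H}{\partial\phi}\right|_{(\phi_0,\theta_0)} = j2\pi f\, a(f) \sum_{l=1}^{J} \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l\, e^{-j2\pi f(l-1)T} \tag{4.3.10}$$

It follows from (4.1.23) that

$$H^*(f,\phi_0,\theta_0) = a^*(f) \sum_{k=1}^{J} f_k\, e^{j2\pi f(k-1)T} \tag{4.3.11}$$

Thus,

$$\left. H^* \frac{\partial H}{\partial\phi}\right|_{(\phi_0,\theta_0)} = j2\pi f\, a(f)\, a^*(f) \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l\, e^{-j2\pi f(l-k)T} \tag{4.3.12}$$

Noting from (4.1.22) that $a(f)\,a^*(f) = 1$ and using this in (4.3.12),

$$\left. H^* \frac{\partial H}{\partial\phi}\right|_{(\phi_0,\theta_0)} = j2\pi f \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l\, e^{-j2\pi f(l-k)T} \tag{4.3.13}$$

Similarly,

$$\left. \frac{\partial H^*}{\partial\phi} H\right|_{(\phi_0,\theta_0)} = -j2\pi f \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l\, e^{j2\pi f(l-k)T} \tag{4.3.14}$$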


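Substituting in (4.3.2),

$$\begin{aligned} \left.\frac{\partial p}{\partial\phi}\right|_{(\phi_0,\theta_0)} &= j2\pi f \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l \left[e^{-j2\pi f(l-k)T} - e^{j2\pi f(l-k)T}\right] \\ &= 4\pi f \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l \sin 2\pi f(l-k)T \end{aligned} \tag{4.3.15}$$

Similarly,

$$\left.\frac{\partial p}{\partial\theta}\right|_{(\phi_0,\theta_0)} = 4\pi f \sum_{l=1}^{J}\sum_{k=1}^{J} f_k\, \mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\, W_l \sin 2\pi f(l-k)T \tag{4.3.16}$$

where

$$\Lambda_\theta(\phi,\theta) = \begin{bmatrix} \dfrac{\partial\tau_1(\phi,\theta)}{\partial\theta} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\partial\tau_L(\phi,\theta)}{\partial\theta} \end{bmatrix} \tag{4.3.17}$$

It follows from (4.3.15) and (4.3.16), respectively, that sufficient conditions for $\left.\partial p/\partial\phi\right|_{(\phi_0,\theta_0)} = 0$
for all $f > 0$ are

$$\mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.18}$$

and sufficient conditions for $\left.\partial p/\partial\theta\right|_{(\phi_0,\theta_0)} = 0$ for all $f > 0$ are

$$\mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.19}$$

Equations (4.3.18) and (4.3.19) denote $2J$ linear constraints on the weights of the broadband
processor. These constraints are sufficient for the first-order derivatives of the power
response with respect to $\phi$ and $\theta$ evaluated at $(\phi_0,\theta_0)$ to be zero. These are referred to as the
first-order derivative constraints and are imposed along with the point constraint discussed
previously.

Using a similar approach to the derivation of the first-order constraints presented in
this section, higher-order derivative constraints may be derived by setting the higher-order
derivatives of the power response with respect to $\phi$ and $\theta$ evaluated at $(\phi_0,\theta_0)$ to zero.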


4.3.2 Second-Order Derivative Constraints 

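The equations describing the second-order derivative constraints follow [Er83]:

$$\mathbf{1}^T \left.\frac{\partial\Lambda_\phi(\phi,\theta)}{\partial\phi}\right|_{(\phi_0,\theta_0)} W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.20}$$

$$\mathbf{1}^T \Lambda_\phi^2(\phi_0,\theta_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.21}$$

$$\mathbf{1}^T \left.\frac{\partial\Lambda_\theta(\phi,\theta)}{\partial\theta}\right|_{(\phi_0,\theta_0)} W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.22}$$

$$\mathbf{1}^T \Lambda_\theta^2(\phi_0,\theta_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.23}$$

$$\mathbf{1}^T \left.\frac{\partial\Lambda_\phi(\phi,\theta)}{\partial\theta}\right|_{(\phi_0,\theta_0)} W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.24}$$

and

$$\mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, \Lambda_\theta(\phi_0,\theta_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.25}$$

These equations denote $6J$ linear constraints that are sufficient for the second-order
derivatives of the power response with respect to $\phi$ and $\theta$ evaluated at $(\phi_0,\theta_0)$ to be zero. These are imposed along
with the point constraint and first-order derivative constraints.

It should be noted that these constraints depend on array geometry and are not necessarily
linearly independent. In the next section, a beamforming problem with derivative
constraints is considered.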


4.3.3 Optimization with Derivative Constraints 

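A beamforming problem using derivative constraints may be formulated similarly to the
constrained beamforming problem considered previously by adding the derivative constraints
specified by (4.3.18) to (4.3.25) to the point constraint given by the second equation of
(4.1.29).

In this case (4.1.29) becomes

$$\begin{aligned} &\underset{W}{\text{minimize}} && W^T R W \\ &\text{subject to} && D^T W = g \end{aligned} \tag{4.3.26}$$

where

$$g^T = \left[f^T, 0^T, \ldots, 0^T\right] \tag{4.3.27}$$

and

$$D = \left[C_0 \,\vdots\, C_1 \,\vdots\, \cdots \,\vdots\, C_8\right] \tag{4.3.28}$$

with $LJ \times J$ matrices $C_0$ to $C_8$ given by

$$C_0 = \mathrm{diag}[\mathbf{1}] \tag{4.3.29}$$

$$C_1 = \mathrm{diag}\left[\mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\right] \tag{4.3.30}$$

$$C_2 = \mathrm{diag}\left[\mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\right] \tag{4.3.31}$$

$$C_3 = \mathrm{diag}\left[\mathbf{1}^T \left.\frac{\partial\Lambda_\phi(\phi,\theta)}{\partial\phi}\right|_{(\phi_0,\theta_0)}\right] \tag{4.3.32}$$

$$C_4 = \mathrm{diag}\left[\mathbf{1}^T \Lambda_\phi^2(\phi_0,\theta_0)\right] \tag{4.3.33}$$

$$C_5 = \mathrm{diag}\left[\mathbf{1}^T \left.\frac{\partial\Lambda_\theta(\phi,\theta)}{\partial\theta}\right|_{(\phi_0,\theta_0)}\right] \tag{4.3.34}$$

$$C_6 = \mathrm{diag}\left[\mathbf{1}^T \Lambda_\theta^2(\phi_0,\theta_0)\right] \tag{4.3.35}$$

$$C_7 = \mathrm{diag}\left[\mathbf{1}^T \left.\frac{\partial\Lambda_\phi(\phi,\theta)}{\partial\theta}\right|_{(\phi_0,\theta_0)}\right] \tag{4.3.36}$$

and

$$C_8 = \mathrm{diag}\left[\mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, \Lambda_\theta(\phi_0,\theta_0)\right] \tag{4.3.37}$$

The notation $\mathrm{diag}[x]$ in (4.3.29) to (4.3.37) is defined as

$$\mathrm{diag}[x] = \begin{bmatrix} x & & 0 \\ & \ddots & \\ 0 & & x \end{bmatrix} \tag{4.3.38}$$

with the $L$ entries of $x$ placed as a column in each of the $J$ diagonal blocks (a row vector such as $\mathbf{1}^T\Lambda_\phi$ is transposed before being placed). For example, in (4.3.29)

$$x = \mathbf{1} \tag{4.3.39}$$

and $C_0$ is given by

$$C_0 = \begin{bmatrix} \mathbf{1} & & 0 \\ & \ddots & \\ 0 & & \mathbf{1} \end{bmatrix} \tag{4.3.40}$$

It can easily be verified from (4.3.26) to (4.3.37) that

$$C_0^T W = f \tag{4.3.41}$$

and

$$C_i^T W = 0, \quad i = 1, \ldots, 8 \tag{4.3.42}$$

Equation (4.3.41) is the second equation of (4.1.29) and defines the point constraint,
whereas (4.3.42) defines the derivative constraints given by (4.3.18) to (4.3.25).

The optimization problem (4.3.26) is similar in form to (4.1.29). Thus, it follows from
(4.1.36) that if $D$ is of full rank, then the optimal weight vector $\hat{W}$, the solution of (4.3.26), is given
by

$$\hat{W} = R^{-1} D \left(D^T R^{-1} D\right)^{-1} g \tag{4.3.43}$$

The rank of $D$ depends on the array geometry. This is explained in the following example
using a linear array [Er83].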

4.3.3.1 Linear Array Example 

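Consider a linear array along the $x$-axis with $x_l$ denoting the position of the $l$th element.
Assume that the directional sources are in the $x$-$y$ plane with the look direction making
an angle $\phi_0$ with the array. In view of these assumptions, it follows that

$$\theta = 90°, \quad y_l = 0, \quad z_l = 0, \quad l = 1, 2, \ldots, L \tag{4.3.44}$$

These equations along with (4.3.6) imply that

$$\tau_l(\phi) = \frac{x_l\cos\phi}{c} \tag{4.3.45}$$

$$\frac{\partial\tau_l(\phi)}{\partial\phi} = \frac{-x_l\sin\phi}{c} \tag{4.3.46}$$

and

$$\frac{\partial^2\tau_l(\phi)}{\partial\phi^2} = \frac{-x_l\cos\phi}{c} \tag{4.3.47}$$

Now consider the constraint Equations (4.3.18) to (4.3.25). Using (4.3.44) and
the fact that the time delay $\tau_l(\phi)$ is not a function of $\theta$, one notes that the constraint equations
(4.3.19), (4.3.22), (4.3.23), (4.3.24), and (4.3.25) vanish. The only constraints remaining are
those given by (4.3.18), (4.3.20), and (4.3.21), that is,

$$\mathbf{1}^T \Lambda_\phi(\phi_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.48}$$

$$\mathbf{1}^T \left.\frac{\partial\Lambda_\phi(\phi)}{\partial\phi}\right|_{\phi_0} W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.49}$$

and

$$\mathbf{1}^T \Lambda_\phi^2(\phi_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.50}$$

where $\Lambda_\phi(\phi)$, $\partial\Lambda_\phi(\phi)/\partial\phi$, and $\Lambda_\phi^2(\phi)$ are diagonal matrices given by

$$\Lambda_\phi(\phi) = \begin{bmatrix} \dfrac{\partial\tau_1(\phi)}{\partial\phi} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\partial\tau_L(\phi)}{\partial\phi} \end{bmatrix} \tag{4.3.51}$$

$$\frac{\partial\Lambda_\phi(\phi)}{\partial\phi} = \begin{bmatrix} \dfrac{\partial^2\tau_1(\phi)}{\partial\phi^2} & & 0 \\ & \ddots & \\ 0 & & \dfrac{\partial^2\tau_L(\phi)}{\partial\phi^2} \end{bmatrix} \tag{4.3.52}$$

and

$$\Lambda_\phi^2(\phi) = \begin{bmatrix} \left(\dfrac{\partial\tau_1(\phi)}{\partial\phi}\right)^2 & & 0 \\ & \ddots & \\ 0 & & \left(\dfrac{\partial\tau_L(\phi)}{\partial\phi}\right)^2 \end{bmatrix} \tag{4.3.53}$$

To simplify the notation, define three $L$-dimensional vectors $\lambda_\phi(\phi)$, $\sigma_\phi(\phi)$, and $\psi_\phi(\phi)$ as

$$\lambda_\phi^T(\phi) = \mathbf{1}^T \Lambda_\phi(\phi) \tag{4.3.54}$$

$$\sigma_\phi^T(\phi) = \mathbf{1}^T \frac{\partial\Lambda_\phi(\phi)}{\partial\phi} \tag{4.3.55}$$

and

$$\psi_\phi^T(\phi) = \mathbf{1}^T \Lambda_\phi^2(\phi) \tag{4.3.56}$$

Using (4.3.45) to (4.3.47) and (4.3.51) to (4.3.53), these become

$$\lambda_\phi(\phi_0) = -\frac{\sin\phi_0}{c}\begin{bmatrix} x_1 \\ \vdots \\ x_L \end{bmatrix} \tag{4.3.57}$$

$$\sigma_\phi(\phi_0) = -\frac{\cos\phi_0}{c}\begin{bmatrix} x_1 \\ \vdots \\ x_L \end{bmatrix} \tag{4.3.58}$$

and

$$\psi_\phi(\phi_0) = \frac{\sin^2\phi_0}{c^2}\begin{bmatrix} x_1^2 \\ \vdots \\ x_L^2 \end{bmatrix} \tag{4.3.59}$$

The three constraint equations (4.3.48) to (4.3.50) are then given by

$$\lambda_\phi^T(\phi_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.60}$$

$$\sigma_\phi^T(\phi_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.61}$$

$$\psi_\phi^T(\phi_0)\, W_l = 0, \quad l = 1, 2, \ldots, J \tag{4.3.62}$$

Note that (4.3.60) denotes $J$ first-order constraint equations, and (4.3.61) and (4.3.62)
denote $2J$ second-order constraints. For a general array, there are $2J$ first-order and
$6J$ second-order derivative constraints as discussed previously. Thus, a linear array
has far fewer constraints than a general array. It should be noted that these constraints are
functions of the look direction.

For a look direction broadside to the array, $\phi_0 = 90°$, it follows from (4.3.58) that $\sigma_\phi(\phi_0) = 0$
and (4.3.61) vanishes, reducing the constraints from $3J$ to $2J$ for a linear array. Similarly,
for an endfire array where the look direction is parallel to the array, $\phi_0 = 0°$ or $\phi_0 = 180°$,
(4.3.57) and (4.3.59) imply that both (4.3.60) and (4.3.62) vanish. Thus, for a linear array in endfire,
only the $J$ second-order constraints given by (4.3.61) remain; the first-order constraints have vanished.

When a beamforming problem is considered using derivative constraints, only the constraint
equations specifying linearly independent constraints need to be considered. It follows
from (4.3.57) and (4.3.58) that the vectors $\lambda_\phi(\phi_0)$ and $\sigma_\phi(\phi_0)$ are not linearly independent;
thus constraints (4.3.60) and (4.3.61) are not independent. Hence, only the $2J$ constraints given
by (4.3.60) and (4.3.62) need to be used in the optimization process.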
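
The behaviour of the three constraint vectors (4.3.57) to (4.3.59) with look direction can be checked numerically. The sketch below is illustrative only: the function name, the element positions, and the propagation speed (1500 m/s, a typical value for sound in water) are assumptions and are not taken from the text.

```python
import math

def linear_array_constraint_vectors(x, phi0_deg, c=1500.0):
    """Vectors of (4.3.57) to (4.3.59) for elements at positions x on the x-axis.

    c is the propagation speed; the default is only an illustrative value.
    """
    phi0 = math.radians(phi0_deg)
    lam = [-xl * math.sin(phi0) / c for xl in x]         # first derivative of the delay
    sig = [-xl * math.cos(phi0) / c for xl in x]         # second derivative of the delay
    psi = [(xl * math.sin(phi0) / c) ** 2 for xl in x]   # squared first derivative
    return lam, sig, psi

x = [0.0, 0.5, 1.0, 1.5]     # hypothetical element positions in metres
lam, sig, psi = linear_array_constraint_vectors(x, 60.0)
# lam and sig are both multiples of x, so their element-wise ratio is constant
# (tan(phi0)), which shows they are linearly dependent.
ratios = [l / s for l, s in zip(lam, sig) if s != 0.0]
lam_b, sig_b, psi_b = linear_array_constraint_vectors(x, 90.0)   # broadside
lam_e, sig_e, psi_e = linear_array_constraint_vectors(x, 0.0)    # endfire
```

At broadside the second vector vanishes up to rounding, and at endfire the first and third vanish, in line with the discussion above.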

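For this case, the beamforming problem given by (4.3.26) to (4.3.37) reduces to

$$ \underset{W}{\text{minimize}} \;\; W^T R W \quad \text{subject to} \quad D^T W = g \qquad (4.3.63) $$

where

$$ g^T = [f^T, 0^T, 0^T] \qquad (4.3.64) $$

and

$$ D = [C_0 \,\vdots\, C_1 \,\vdots\, C_2] \qquad (4.3.65) $$

with

$$ C_0 = \begin{bmatrix} \mathbf{1} & & 0 \\ & \ddots & \\ 0 & & \mathbf{1} \end{bmatrix} \qquad (4.3.66) $$

$$ C_1 = \begin{bmatrix} \lambda(\phi_0) & & 0 \\ & \ddots & \\ 0 & & \lambda(\phi_0) \end{bmatrix} \qquad (4.3.67) $$

and

$$ C_2 = \begin{bmatrix} \psi(\phi_0) & & 0 \\ & \ddots & \\ 0 & & \psi(\phi_0) \end{bmatrix} \qquad (4.3.68) $$

For linearly independent vectors $\mathbf{1}$, $\lambda(\phi_0)$, and $\psi(\phi_0)$, the constraint matrix D has full rank
[Er83] and the beamforming solution is given by (4.3.43).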
4.3.4 Adaptive Algorithm

An estimate of the solution of the beamforming problem (4.3.63), which converges in mean
to the optimal weights given by (4.3.43), may be made using a constrained LMS algorithm
similar to that given by (4.1.52), (4.1.53), and (4.1.49). In this case, it becomes

$$ W(n+1) = P\,[W(n) - \mu\, y(n) X(n+1)] + G \qquad (4.3.69) $$

where the projection operator

$$ P = I - D (D^T D)^{-1} D^T \qquad (4.3.70) $$

and

$$ G = D (D^T D)^{-1} g \qquad (4.3.71) $$

The algorithm is initialized at n = 0 with

$$ W(0) = G \qquad (4.3.72) $$

Note that the initial weight vector W(0) corresponds to the optimal weight given by (4.1.43)
in the presence of white noise only.
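
The recursion (4.3.69) to (4.3.72) can be sketched in a few lines of NumPy. This is a minimal illustration only: the constraint matrix `D`, the vector `g`, the snapshots `X`, and the step size `mu` are random or arbitrary placeholders, not values from the text, and the output y(n) is formed from the same snapshot that drives the update, which simplifies the time indexing of (4.3.69).

```python
import numpy as np

def constrained_lms(X, D, g, mu):
    """Iterate W <- P[W - mu*y*x] + G over the columns of X (sketch of (4.3.69))."""
    DtD_inv = np.linalg.inv(D.T @ D)
    P = np.eye(D.shape[0]) - D @ DtD_inv @ D.T   # projection operator (4.3.70)
    G = D @ DtD_inv @ g                          # minimum-norm feasible vector (4.3.71)
    W = G.copy()                                 # initialisation (4.3.72)
    for n in range(X.shape[1]):
        x = X[:, n]
        y = W @ x                                # processor output for this snapshot
        W = P @ (W - mu * y * x) + G             # constrained update (4.3.69)
    return W, P, G

# Hypothetical data: 12 weights, 4 linear constraints, white-noise snapshots.
rng = np.random.default_rng(0)
D = rng.standard_normal((12, 4))
g = np.array([1.0, 0.0, 0.0, 0.0])
X = rng.standard_normal((12, 500))
W, P, G = constrained_lms(X, D, g, mu=1e-3)
print(np.abs(D.T @ W - g).max())                 # constraint residual after 500 updates
```

Because $D^T P = 0$ and $D^T G = g$, every iterate satisfies the constraints regardless of the data, which is why the projection and the additive term G appear together in the update.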

Due to the sparse nature of matrices $C_0$, $C_1$, and $C_2$, the projection operator P is sparse
and allows development of a temporally decoupled update equation to estimate the J
columns of L weights similar to that discussed earlier for the point constraint. For this
case, it is given by [Buc86]

$$ w_j(n+1) = P\,[\,w_j(n) - \mu\, y(n)\, x(n+1-j)\,] + f_j\, C (C^T C)^{-1} e_1, \quad j = 1, 2, \ldots, J \qquad (4.3.73) $$

where, in this decoupled form, P and C are L × L and L × 3 matrices given by

$$ P = I - C (C^T C)^{-1} C^T \qquad (4.3.74) $$

$$ C = [\mathbf{1}, \lambda(\phi_0), \psi(\phi_0)] \qquad (4.3.75) $$

and

$$ e_1^T = [1, 0, 0] \qquad (4.3.76) $$

4.3.5 Choice of Origin 

In array system design, the location of the time reference point (origin of the coordinate
system) with respect to the array elements is chosen for notational convenience. In most
cases, it is one element of the array or the array's center of gravity. These origin choices do
not affect the array beam pattern or output SNR. However, this is not the case when
derivative constraints are involved. The reason is that the constraint matrix D is a function
of $\tau_l(\phi,\theta)$, which in turn depends on the origin choice, as it denotes the time taken by a plane
wave arriving from direction $(\phi,\theta)$ and measured from the origin to the array's lth element.
This dependence of the constraint matrix D on the choice of origin affects the solution of
the constrained beamforming problem. Hence, the beam pattern and the output SNR of
the beamformer using optimal weights depend on the choice of origin.

The vector G used to initialize the adaptive algorithm is the optimal weight under white
noise conditions, and the output noise power is proportional to the squared norm of this weight,
that is, $G^T G$. In view of this, the chosen origin should minimize $G^T G$ [Buc86].

It follows from (4.3.71) that

$$ G^T G = g^T (D^T D)^{-1} g \qquad (4.3.77) $$

The first-order and the second-order derivative constraints discussed in this section so far 
are sufficient to ensure that the power response derivatives evaluated at the look direction 
are zero. However, these constraints are not the necessary and sufficient conditions for 
the derivatives to be zero. What follows is a discussion of the first-order derivative
constraints for a flat-response processor. For this case, these constraints are necessary and 
sufficient conditions which ensure that the array beam pattern is independent of the choice 
of origin [Er90]. 

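The constraint vector f for the case of a flat frequency response in the look direction is
given by (4.1.28), that is,

$$ f_i = \begin{cases} 1, & i = k_0 \\ 0, & i \neq k_0 \end{cases} \qquad (4.3.78) $$

Substituting (4.3.78) in (4.3.15), it follows that

$$ \left.\frac{\partial \rho}{\partial \phi}\right|_{(\phi_0,\theta_0)} = 4\pi f \sum_{l=1}^{J} \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, w_l \sin 2\pi f (l - k_0) T \qquad (4.3.79) $$

If J is odd and $k_0 = (J+1)/2$, then (4.3.79) can be rewritten as

$$ \left.\frac{\partial \rho}{\partial \phi}\right|_{(\phi_0,\theta_0)} = 4\pi f \sum_{l=1}^{k_0-1} \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\,(w_{k_0+l} - w_{k_0-l}) \sin 2\pi f l T \qquad (4.3.80) $$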

As the right-hand side of (4.3.80) is a finite Fourier series, it follows that the necessary
and sufficient conditions for

$$ \left.\frac{\partial \rho}{\partial \phi}\right|_{(\phi_0,\theta_0)} = 0 $$

for all f > 0 are that all series coefficients are simultaneously equal to zero, that is,

$$ \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\,(w_{k_0+l} - w_{k_0-l}) = 0, \quad l = 1, 2, \ldots, k_0 - 1, \quad k_0 = \frac{J+1}{2} \qquad (4.3.81) $$

Similarly, the necessary and sufficient conditions for

$$ \left.\frac{\partial \rho}{\partial \theta}\right|_{(\phi_0,\theta_0)} = 0 $$

for all f > 0 are

$$ \mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\,(w_{k_0+l} - w_{k_0-l}) = 0, \quad l = 1, 2, \ldots, k_0 - 1, \quad k_0 = \frac{J+1}{2} \qquad (4.3.82) $$

Note that (4.3.81) and (4.3.82) denote J − 1 linear constraints compared to 2J linear
constraints given by (4.3.18) and (4.3.19). Discussion on second-order derivative constraints
may be found in [Er90], and an unconstrained partitioned realization of the
processor with derivative constraints is provided by [Er86a].


4.4 Correlation Constrained Processor 

A set of nondirectional constraints to improve the performance of a broadband array 
processor using a TDL structure under look direction errors is discussed in [Kik89]. These 
are referred to as correlation constraints, and they use known characteristics of the desired 
signal to estimate an LJ-dimensional correlation vector $r_d$ between the desired signal and
the array signal vector due to the desired signal, that is,

$$ r_d = E[s_d(t) X_d(t)] \qquad (4.4.1) $$

where $s_d(t)$ denotes the desired signal induced on the reference element, and the LJ-dimensional
vector $X_d(t)$ denotes the array signal across the TDL structure due to the desired
signal only.

The beamforming problem in this case becomes

$$ \underset{W}{\text{minimize}} \;\; W^T R W \quad \text{subject to} \quad r_d^T W = \rho_0 \qquad (4.4.2) $$

where $\rho_0$ is a scalar constant that specifies the correlation between the desired signal and
array output due to the desired signal, that is,

$$ \rho_0 = E[s_d(t) y_d(t)] \qquad (4.4.3) $$
where

$$ y_d(t) = W^T X_d(t) \qquad (4.4.4) $$


For the desired signal with a flat spectrum over the frequency band of interest the con¬ 
straint in (4.4.2) becomes P [ W = 1 [Er93]. It can easily be verified that the solution W c of 
the beamforming problem (4.4.2) is given by 

$$ W_c = R^{-1} r_d \,(r_d^T R^{-1} r_d)^{-1} \rho_0 \qquad (4.4.5) $$
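
As a quick numerical check of (4.4.5), the following sketch evaluates the weights for a randomly generated positive definite matrix and correlation vector. The matrix `R`, the vector `r_d`, and the value of `rho_0` are hypothetical placeholders chosen only to exercise the formula.

```python
import numpy as np

def correlation_constrained_weights(R, r_d, rho_0):
    """Weights of (4.4.5): R^{-1} r_d (r_d^T R^{-1} r_d)^{-1} rho_0."""
    Rinv_rd = np.linalg.solve(R, r_d)            # R^{-1} r_d without forming the inverse
    return Rinv_rd * rho_0 / (r_d @ Rinv_rd)

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 8))
R = A @ A.T + 8.0 * np.eye(8)                    # symmetric positive definite stand-in
r_d = rng.standard_normal(8)
rho_0 = 1.0
W_c = correlation_constrained_weights(R, r_d, rho_0)

# Any other feasible vector differs from W_c by a component orthogonal to r_d.
v = rng.standard_normal(8)
v -= r_d * (r_d @ v) / (r_d @ r_d)
W_alt = W_c + v
power_opt = W_c @ R @ W_c
power_alt = W_alt @ R @ W_alt
print(r_d @ W_c, power_opt, power_alt)
```

The perturbed vector still satisfies the constraint but produces a larger output power, which is what the minimization in (4.4.2) requires of $W_c$.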


4.5 Digital Beamforming 

In this section, in a brief review of digital beamforming, the process of forming beams in 
various directions is described [God97]. First, consider the analog beamformer structure 
shown in Figure 4.5, where signals from all elements are weighted, delayed, and summed 
to form a beam. The output of the beamformer is given by

$$ y(t) = \sum_{l=1}^{L} w_l\, x_l(t - \tau_l(\phi)) \qquad (4.5.1) $$


The delay in front of each element is adjusted such that the signals induced from a given 
direction, where the beam needs to be pointed to, are aligned after the delays. The weights 
are adjusted to shape the beam. 

In digital beamforming [Pri78, Dud77, Muc84, Pri79, Fan84, Mar89, Rud69, Gab84, Bra80, 
Syl86, DeM77], the weighted signals from various elements are sampled, stored, and 
summed after appropriate delays to form beams. The required delay is provided by 
selecting samples from different elements such that the selected samples are taken at 
different times. Each sample is delayed by an integer multiple of the sampling interval Δ.
The process is shown in Figure 4.6 for a linear array of equispaced elements where the 
samples of weighted signals are shown as circles. Weights on each element are not shown. 



FIGURE 4.5 

Delay and sum processor structure. 
FIGURE 4.6

Digital beamforming process. (From Godara, L.C., Application to antenna arrays to mobile communications. 
Part II: Beamforming and direction of arrival considerations, IEEE Proc., 85, 1195-1247, 1997. ©IEEE. With 
permission.) 


Assume that a beam is to be formed in direction $\phi_2$. Let the direction be such that

$$ \tau_l(\phi_2) = (l - 1)\Delta \qquad (4.5.2) $$

Thus, the signal from the lth element needs to be delayed by (l − 1)Δ seconds. This may
be accomplished by summing the samples on a line marked with symbol A in Figure 4.6.
For this case, the samples from Element 1 are not delayed, samples from Element 2 are
delayed by one sample, and so on.

Similarly, a beam may be steered in direction $\phi_3$ by summing the samples connected by
the line marked with symbol B in Figure 4.6, where the signals from the Lth element are not
delayed, samples from element L − 1 are delayed by one sample, and so on. The beam
formed in direction $\phi_1$, by summing the samples connected by the line marked with symbol
C, does not require any delay.

It follows from the above discussion that when using this process, one can only form
beams in directions that require delays equal to some integer multiple of the sampling
interval, that is,

$$ \tau_l(\phi) = k_l \Delta \qquad (4.5.3) $$

where $k_l$, l = 1, 2, ..., L, are integers. The number of discrete directions where a beam can
be exactly pointed increases with an increased sampling rate, as shown in Figure 4.7, where the
sampling interval is Δ/2. The figure shows that additional beams in directions $\phi_4$ and $\phi_5$
may be formed. These exact beams are normally referred to as synchronous or natural
beams [Pri78], and it is possible to form a number of these beams simultaneously using
a separate summing network for each beam.
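
The sample-selection idea behind synchronous beams can be sketched as follows. This is an illustrative example under simple assumptions: the function name `delay_and_sum`, the uniform weights, and the synthetic wavefront (each element leading the previous one by exactly one sample) are all hypothetical, chosen so that the required delays are integer multiples of the sampling interval as in (4.5.2).

```python
import numpy as np

def delay_and_sum(x, delays, weights=None):
    """Sum element samples after integer sample delays.

    x       : array of shape (L, N), one row of samples per element
    delays  : non-negative integer delay k_l (in samples) for each element
    weights : optional element weights, uniform 1/L by default
    """
    L, N = x.shape
    if weights is None:
        weights = np.full(L, 1.0 / L)
    max_d = int(np.max(delays))
    y = np.zeros(N - max_d)
    for l in range(L):
        start = max_d - int(delays[l])           # picks x_l[n - k_l] for n >= max_d
        y += weights[l] * x[l, start:start + N - max_d]
    return y

# Synthetic wavefront: element l receives the signal l samples early.
rng = np.random.default_rng(2)
L, N = 4, 200
s = rng.standard_normal(N + L)
x = np.stack([s[l:l + N] for l in range(L)])

y_steered = delay_and_sum(x, np.arange(L))       # delays (l - 1) samples, as in (4.5.2)
y_broadside = delay_and_sum(x, np.zeros(L, dtype=int))
```

With the matched delays the element signals add coherently and the beam output reproduces the source waveform; with zero delays (the broadside beam) the same wavefront is averaged incoherently.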
FIGURE 4.7

Effect of sampling on digital beamforming. (From Godara, L.C., Application to antenna arrays to mobile com¬ 
munications. Part II: Beamforming and direction of arrival considerations, IEEE Proc., 85,1195-1247,1997. ©IEEE. 
With permission.) 


The practical requirement of an adequate set of directions in which simultaneous beams
need to be pointed implies that the array signals must be sampled at much higher rates than
required by the Nyquist criterion to reconstruct the waveform from the samples [Pap75].
The high sampling rate means large storage requirements along with high-speed
input-output devices, analog-to-digital converters, and large bandwidth cables [Pri78].

The high sampling rate requirement may be overcome by digital interpolation [Pri78, 
Pri79, Syl86], which basically simulates the samples generated by high sampling rates and 
thus increases the effective sampling rate. The process works by sampling the array signal 
at a Nyquist rate or higher and padding with zeros between each sample to form a new 
sequence. The number of zeros padded decides the effective sampling rate. To increase
the sampling rate L-fold, L − 1 zeros are inserted between successive samples, creating a
sequence as long as one produced by sampling at the higher rate. The padded sequences then are used for digital
beamforming by selecting appropriate samples as required and the beam output is passed 
through an FIR filter to remove the unwanted spectrum. This filter is normally referred 
to as an interpolation filter. The beams formed by interpolation beamformers have a 
slightly higher side-lobe level. 
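
The zero-padding and filtering steps can be illustrated on a single channel. The sketch below is an assumption-laden example, not the scheme of [Pri78]: the interpolation factor is called `U` here (to avoid confusion with the number of elements), the interpolation filter is a Hamming-windowed sinc of arbitrarily chosen length, and the test signal is a low-frequency cosine.

```python
import numpy as np

def interpolate(x, U, half_len=8):
    """Raise the sampling rate of x by an integer factor U."""
    z = np.zeros(len(x) * U)
    z[::U] = x                                   # insert U - 1 zeros between samples
    n = np.arange(-half_len * U, half_len * U + 1)
    h = np.sinc(n / U) * np.hamming(len(n))      # windowed-sinc FIR interpolation filter
    return np.convolve(z, h, mode="same")        # odd-length filter, so no net delay

U = 4
f0 = 0.05                                        # cycles per original sample
x_low = np.cos(2 * np.pi * f0 * np.arange(64))
x_high = interpolate(x_low, U)

# Samples that direct high-rate sampling of the same cosine would have produced.
x_true = np.cos(2 * np.pi * f0 * np.arange(64 * U) / U)
interior = slice(8 * U, -8 * U)                  # ignore filter edge effects
max_err = np.max(np.abs(x_high[interior] - x_true[interior]))
```

The filter passes the original samples through unchanged and fills in the intermediate ones, so delays can afterwards be selected on the finer grid of spacing Δ/U.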

A tutorial introduction to digital interpolation beamformers is given in [Pri78], whereas 
some additional fundamentals of digital array processing may be found in [Dud77]. A 
comparison of many approaches to digital beamforming implementations is discussed in 
[Muc84, Mar89], who show how a real-time implementation is a trade-off between various 
conflicting requirements of hardware complexities, memory requirements, and system 
performance. 

The shape of a beam, particularly its beamwidth, is controlled by the size of the array.
Generally, a narrower beam results from a larger array. In practice, the array size is fixed
and its extent is limited. A process known as extrapolation may be used [Fan84] during
digital beamforming to simulate a larger array extent, resulting in an improved beam pattern.
Just as interpolation increases the effective sampling rate, extrapolation extends the
effective array length. More information on signal extrapolation schemes may be found
in [Pap75, Sul91, Cad79, Son82, Jai81, Sna83].

Digital beamforming techniques for mobile satellite communications are examined in 
[Chu90] by studying a configuration of a digital beamforming system capable of working 
in transmit and receive modes. Digital beamforming for mobile satellite communications 
has also been reported in [Geb95, Chu90]. An introduction to digital beamforming for 
mobile communications may be found in [Ste87]. 


4.6 Frequency Domain Processing 

A general structure of the element-space frequency domain processor is shown in 
Figure 4.8, where broadband signals from each element are transformed into a frequency 
domain using the discrete Fourier transform (DFT), and each frequency bin is processed 
by a narrowband processor structure. The weighted signals from all elements are summed 
to produce an output at each bin. The weights are selected independently by minimizing 
the mean output power at each frequency bin subject to steering direction constraints. 
Thus, the weights required for each frequency bin are selected independently and this 
selection may be performed in parallel, leading to faster weight update. When an adaptive 
algorithm such as the LMS algorithm is used for weight updating, a different step size 
may be used for each bin leading to faster convergence. 



FIGURE 4.8 

Frequency domain processor structure. 
Various aspects of array signal processing in a frequency domain are reported in the
literature [Hod79, Arm74, Den78, Nar81, Web84, Shy85, Ho88, Ber86, Ree85, Man82,
Kum90, Cla83, Zhu90, God95, Hin81]. The optimum performance of the time domain and
frequency domain processors is the same only when the signals in various frequency
bins are independent. This independence assumption is mostly made in the study of 
frequency domain processing. When the assumption does not hold, the frequency domain 
processor may be suboptimal. Some of the tradeoffs and a comparison of the two proces¬ 
sors are discussed in [Hod79, God95]. 

A study of the frequency domain algorithm [Web84] for coherent signals indicates that 
the frequency domain method is insensitive to the sampling rate, and may be able to 
reduce the effects of element malfunctioning on the beam pattern. A study in [Shy85] 
shows that due to its modular parallel structure, beamforming in the frequency domain
is well suited for VLSI implementation and is less sensitive to the coefficient quantization. 
Computational advantages of the frequency domain method (FDM) for bearing estimation 
are discussed in [Ree85, Kum90, Hin81], and for correlated data are considered in [Man82, 
Zhu90]. A general treatment of time and frequency domain realization with a view to 
compare the structure of various algorithms of weight estimation in a unified manner is 
provided in [God95]. 

In this section, frequency domain processing is studied in detail using a constrained 
element space processor, and relationships between the time domain processor and the 
frequency domain processor are established [God95]. 


4.6.1 Description 

Consider an L-element array immersed in a noise field consisting of uncorrelated broad¬ 
band directional sources and white noise. Let s(t) be a broadband real signal, with the 
power spectral density S(f) induced on a reference element due to a source. The autocor¬ 
relation function 


$$ \rho(\tau) = E[s(t)\, s(t + \tau)] \qquad (4.6.1) $$

is the inverse Fourier transform of S(f), that is,

$$ \rho(\tau) = \int_{-\infty}^{\infty} S(f)\, e^{j 2\pi f \tau}\, df \qquad (4.6.2) $$


Let $x_l(t)$ denote the time waveform derived from the lth element after presteering. Let
these waveforms be sampled at frequency $f_s$. Denoting the sampling interval by T, the
sampled waveform derived from the lth element becomes $x_l(nT)$. As the sampling period
does not play any role in the treatment that follows, it has been omitted for ease of notation.
Let x(n) denote the L samples after presteering delays, that is,

$$ x(n) = [x_1(n), x_2(n), \ldots, x_L(n)]^T \qquad (4.6.3) $$

Now consider N array samples x(n - i + 1), i = 1, ..., N, with x(n) denoting the most recent 
samples. Let these be processed by the frequency domain processor structure shown in 
Figure 4.8, where these are first converted into N frequency bins using discrete Fourier 
transforms and then processed using N narrowband processors. 
Let $\tilde{y}(k)$ denote the output of the kth bin. From Figure 4.8, it follows that

$$ \tilde{y}(k) = h^H(k)\, \tilde{x}(k) \qquad (4.6.4) $$

where an L-dimensional complex vector h(k) denotes the L weights of the narrowband
processor for the kth bin, that is,

$$ h(k) = [h_1(k), \ldots, h_L(k)]^T \qquad (4.6.5) $$

with $h_l(k)$ denoting the weight on the lth channel.

The L-dimensional complex vector $\tilde{x}(k)$ denotes the L frequency domain samples, that is,

$$ \tilde{x}(k) = [\tilde{x}_1(k), \ldots, \tilde{x}_L(k)]^T \qquad (4.6.6) $$

with $\tilde{x}_l(k)$ denoting the frequency domain samples from the lth channel. The N frequency
samples of the lth channel $\tilde{x}_l(k)$, k = 0, 1, ..., N − 1 are related to the N time samples $x_l(n)$,
n = 1, 2, ..., N by the discrete Fourier transform [Bur85], that is,

$$ \tilde{x}_l(k) = \sum_{i=1}^{N} x_{li}\, e^{-j\frac{2\pi}{N}(i-1)k}, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.7) $$

where $x_{li} = x_l(n - i + 1)$, i = 1, 2, ..., N and $x_{l1} = x_l(n)$ denotes the most recent sample.

Thus, using N array samples x(n − i + 1), i = 1, 2, ..., N, the frequency domain processor
produces N frequency domain outputs $\tilde{y}(k)$, k = 0, 1, ..., N − 1. These are converted into
N output time samples y(n − i + 1), i = 1, 2, ..., N using the inverse DFT, that is,

$$ y(n - i + 1) = \frac{1}{N} \sum_{k=0}^{N-1} \tilde{y}(k)\, e^{j\frac{2\pi}{N}(i-1)k} \qquad (4.6.8) $$

where y(n) denotes the most recent output.

The most recent output corresponds to i = 1 in the LHS of (4.6.8). Thus, it follows from
(4.6.8) that

$$ y(n) = \frac{1}{N} \sum_{k=0}^{N-1} \tilde{y}(k) \qquad (4.6.9) $$


Thus, the most recent output sample may be obtained by averaging the output of N 
narrowband processors without computing N-point inverse DFT. This aspect is exploited 
in sliding window processing, where N most recent input samples are converted into 
frequency domain using DFT, and the time domain output is obtained by averaging the 
N outputs. In this scheme, every time a new input sample arrives, a full cycle involving 
conversion to frequency domain using DFT, narrowband processing, and computation of 
output using (4.6.9) needs to be carried out. 
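
The shortcut used in sliding window processing is easy to confirm numerically. In the sketch below the array samples and bin weights are random placeholders; `np.fft.fft` and `np.fft.ifft` use the same sign and scaling conventions as the DFT pair described above, with array index 0 holding the most recent sample.

```python
import numpy as np

rng = np.random.default_rng(3)
L, N = 4, 8
x_time = rng.standard_normal((L, N))             # x_time[l, i-1] = x_l(n - i + 1)
h = rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))  # h[:, k]: weights of bin k

x_freq = np.fft.fft(x_time, axis=1)              # DFT of each channel
y_bins = np.sum(np.conj(h) * x_freq, axis=0)     # narrowband output h^H(k) x(k) for each bin

y_block = np.fft.ifft(y_bins)                    # block processing: all N output samples
y_recent = y_bins.mean()                         # sliding window: average of the bin outputs
print(y_block[0], y_recent)
```

The first element of the inverse DFT and the plain average agree, so the sliding window scheme can skip the inverse transform entirely.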
The other processing scheme discussed previously where N input time samples are
collected, converted to the frequency domain, processed using N narrowband processors, 
and converted back to N output time samples using the inverse DFT is referred to as block 
processing [Com88]. Thus, in summary, in block processing a block of N input samples 
is collected to be processed using narrowband processing to obtain N output time samples. 
On the other hand, in sliding window processing every time a new sample arrives, the 
complete processing cycle is invoked. The difference in the processing cycle for the two 
schemes is that the sliding window processing does not use the inverse DFT. 

In both cases, once the N time samples are converted into N frequency domain samples, 
any of the narrowband processing schemes discussed in previous chapters may be used. 
In the next section, the relationship between the frequency domain processing discussed 
in this section and the time domain processing using the TDL structure discussed earlier 
is established. 


4.6.2 Relationship with Tapped-Delay Line Structure Processing 

Assume that the N array samples x(n - i + 1), i = 1,2,..., N are processed by two processor 
structures, namely, the TDL structure shown in Figure 4.1 where the processing is carried 
out in the time domain and frequency domain processor structure shown in Figure 4.8 
where the processing is carried out in frequency domain. In the following, the conditions 
are derived for the two processors to produce identical outputs. 

4.6.2.1 Weight Relationship 

The output of the time domain processor shown in Figure 4.1 is given by

$$ y(n) = W^T X(n) \qquad (4.6.10) $$

where W is defined in (4.1.4) and X(n) is defined in (4.1.6) with t replaced by n. Rewrite
(4.6.10) as

$$ y(n) = \sum_{l=1}^{L} \sum_{m=1}^{J} w_{lm}\, x_l(n - m + 1) = \sum_{l=1}^{L} \sum_{m=1}^{J} w_{lm}\, x_{lm} \qquad (4.6.11) $$

It follows from (4.6.11) that the output at time n depends on the present input $x_l(n)$ and
J − 1 previous inputs, namely, $x_l(n-1)$, ..., $x_l(n-J+1)$. Thus, for a given set of N samples
under consideration, one is able to obtain only N − J + 1 output samples, namely y(n),
y(n − 1), ..., y(n − N + J). This implies that for J = N, these samples only produce one
output sample, given by

$$ y(n) = \sum_{l=1}^{L} \sum_{m=1}^{N} w_{lm}\, x_{lm} \qquad (4.6.12) $$


Now, consider the frequency domain processor processing the same N samples. The
most recent time sample for the frequency domain processor is given by (4.6.9). For the
two processors to produce identical outputs, the time samples given by (4.6.9) and (4.6.12)
must be equal, that is,

$$ \sum_{l=1}^{L} \sum_{m=1}^{N} w_{lm}\, x_{lm} = \frac{1}{N} \sum_{k=0}^{N-1} \tilde{y}(k) \qquad (4.6.13) $$

Rewrite (4.6.4) as

$$ \tilde{y}(k) = \sum_{l=1}^{L} h_l^*(k)\, \tilde{x}_l(k), \quad k = 0, 1, \ldots, N-1 \qquad (4.6.14) $$

It follows from (4.6.13), (4.6.14), and (4.6.7) that

$$ \sum_{l=1}^{L} \sum_{m=1}^{N} w_{lm}\, x_{lm} = \frac{1}{N} \sum_{k=0}^{N-1} \sum_{l=1}^{L} h_l^*(k) \sum_{i=1}^{N} x_{li}\, e^{-j\frac{2\pi}{N}(i-1)k} = \sum_{l=1}^{L} \sum_{i=1}^{N} x_{li}\, \frac{1}{N} \sum_{k=0}^{N-1} h_l^*(k)\, e^{-j\frac{2\pi}{N}(i-1)k} \qquad (4.6.15) $$

The identity holds if

$$ w_{lm} = \frac{1}{N} \sum_{k=0}^{N-1} h_l^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k}, \quad l = 1, 2, \ldots, L, \quad m = 1, 2, \ldots, N \qquad (4.6.16) $$

Thus,

$$ \{ w_{lm},\; m = 1, 2, \ldots, N \} = \text{DFT}\left\{ \frac{h_l^*(k)}{N},\; k = 0, 1, \ldots, N-1 \right\} \qquad (4.6.17) $$

It follows then that both processors produce identical outputs when the TDL structure
has length equal to N and the two sets of weights are related by (4.6.16).
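
The weight relationship can be verified numerically. The sketch below starts from arbitrary real TDL weights, inverts (4.6.16) to obtain narrowband weights, and compares the two processor outputs. The data and weights are random placeholders; the only assumption is that `np.fft.fft` matches the DFT convention of (4.6.7), with array index m − 1 standing for tap m.

```python
import numpy as np

rng = np.random.default_rng(4)
L, N = 3, 8
x_time = rng.standard_normal((L, N))             # x_time[l, m-1] = x_lm
w = rng.standard_normal((L, N))                  # real TDL weights w_lm, with J = N taps

# Invert (4.6.16): conj(h_l(k)) = N * IDFT{w_lm}, hence h_l(k) = conj(N * IDFT{w_lm}).
h = np.conj(N * np.fft.ifft(w, axis=1))

# Forward relation (4.6.16)/(4.6.17): w_lm = DFT{conj(h_l(k)) / N}.
w_back = np.fft.fft(np.conj(h), axis=1) / N

y_tdl = np.sum(w * x_time)                       # time domain output (4.6.12)
x_freq = np.fft.fft(x_time, axis=1)
y_fd = np.mean(np.sum(np.conj(h) * x_freq, axis=0))   # frequency domain output (4.6.9)
print(y_tdl, y_fd)
```

The frequency domain output carries only a rounding-level imaginary part, and its real part matches the TDL output, as the identity (4.6.15) predicts.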

4.6.2.2 Matrix Relationship 

Consider the output sequence of the frequency domain structure of Figure 4.8. Assume
that $M_0$ sets, each of N samples, are being processed. Let $\tilde{y}(k,m)$ denote the output of the
kth frequency bin due to the mth data block. For a given h(k), the mean output power of
the kth bin is given by

$$ P(k) = \frac{1}{M_0} \sum_{m=0}^{M_0-1} \tilde{y}(k,m)\, \tilde{y}^*(k,m) \qquad (4.6.18) $$

Following (4.6.4), the output of the kth bin due to the mth set is given by

$$ \tilde{y}(k,m) = h^H(k)\, \tilde{x}(k,m) \qquad (4.6.19) $$

This along with (4.6.18) implies that

$$ P(k) = h^H(k)\, R_f(k)\, h(k) \qquad (4.6.20) $$

where

$$ R_f(k) = \frac{1}{M_0} \sum_{m=0}^{M_0-1} \tilde{x}(k,m)\, \tilde{x}^H(k,m) \qquad (4.6.21) $$

is an estimate of the array correlation matrix for the kth bin.

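It follows from (4.6.21) that

$$ (R_f(k))_{l,n} = \frac{1}{M_0} \sum_{m=0}^{M_0-1} \tilde{x}_l(k,m)\, \tilde{x}_n^*(k,m) \qquad (4.6.22) $$

Since

$$ \tilde{x}_l(k,m) = \sum_{i=1}^{N} x_{li}(m)\, e^{-j\frac{2\pi}{N}(i-1)k} $$

it follows from (4.6.22) that

$$ (R_f(k))_{l,n} = \frac{1}{M_0} \sum_{m=0}^{M_0-1} \sum_{i=1}^{N} x_{li}(m)\, e^{-j\frac{2\pi}{N}(i-1)k} \sum_{i=1}^{N} x_{ni}(m)\, e^{j\frac{2\pi}{N}(i-1)k} \qquad (4.6.23) $$

Note that the time domain sample $x_l$ is a real variable and the frequency domain sample $\tilde{x}_l$ is a complex variable. Define an N-dimensional vector
$x_l(m)$ representing N samples in the tapped delay line structure on the lth channel as

$$ x_l(m) = [x_{l1}(m), x_{l2}(m), \ldots, x_{lN}(m)]^T, \quad l = 1, 2, \ldots, L \qquad (4.6.24) $$

and an N-dimensional vector e(k) representing N phasors at the kth bin as

$$ e(k) = \left[ 1,\; e^{j\frac{2\pi}{N}k},\; \ldots,\; e^{j\frac{2\pi}{N}(N-1)k} \right]^T \qquad (4.6.25) $$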
From (4.6.23) to (4.6.25) it follows that

$$ (R_f(k))_{l,n} = \frac{1}{M_0} \sum_{m=0}^{M_0-1} e^H(k)\, x_l(m)\, x_n^T(m)\, e(k) = e^H(k)\, (R_{l,n})\, e(k) \qquad (4.6.26) $$

where

$$ (R_{l,n}) = \frac{1}{M_0} \sum_{m=0}^{M_0-1} x_l(m)\, x_n^T(m) \qquad (4.6.27) $$

is an N × N matrix denoting the correlation between the lth and nth elements for the
tapped delay line structure, estimated from $M_0$ sets of samples, each of length N. It is an
unbiased estimate for the correlation between the lth and nth elements for given $M_0$
samples. As $M_0$ increases, the estimate asymptotically approaches the true correlation.
Therefore, the relationship between the frequency domain and time domain matrices holds
for the true correlation matrices.
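
The relationship between the frequency domain and time domain correlation estimates can be checked on synthetic data. The following sketch uses random real samples as a stand-in for the array data and an arbitrarily chosen bin index; the array layout (block, element, tap) is a choice made for this example only.

```python
import numpy as np

rng = np.random.default_rng(5)
L, N, M0 = 3, 8, 50
x = rng.standard_normal((M0, L, N))              # x[m, l, i-1] = x_li(m), real samples
x_f = np.fft.fft(x, axis=2)                      # frequency samples, x_f[m, l, k]
k = 2                                            # bin under test

# Frequency domain estimate: average of x(k,m) x^H(k,m) over the M0 blocks.
R_f = np.einsum("ml,mn->ln", x_f[:, :, k], np.conj(x_f[:, :, k])) / M0

# Time domain estimates: R_td[l, n] is the N x N matrix averaging x_l(m) x_n^T(m).
R_td = np.einsum("mli,mnj->lnij", x, x) / M0

# Phasor vector of the kth bin and the quadratic form e^H(k) R_td[l, n] e(k).
e_k = np.exp(1j * 2 * np.pi * np.arange(N) * k / N)
R_f_from_td = np.einsum("i,lnij,j->ln", np.conj(e_k), R_td, e_k)
```

Both routes give the same L × L matrix, so each narrowband matrix is a fixed quadratic form of the tapped delay line correlation blocks.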

Throughout the chapter, $R_f$ and R are used to denote the frequency domain and time
domain array correlation matrices, respectively, as well as their unbiased estimates.
Furthermore, the correlation between the mth and nth taps is denoted by the matrix $(R_{m,n})$,
and the correlation between the lth and ith elements is denoted by the matrix $(R_{l,i})$.


4.6.2.3 Derivation of Rf(k) 

Let $(R_{m,n})_{l,i}$ denote the correlation between the lth and ith elements after the mth and nth taps due
to a source in direction $(\phi,\theta)$. An expression for $(R_{m,n})_{l,i}$ from (4.1.11) is given by

$$ (R_{m,n})_{l,i} = \rho[(m - n)T + T_l - T_i + \tau_i - \tau_l] \qquad (4.6.28) $$

where the arguments $\phi$ and $\theta$ have been suppressed for ease of notation.

As the correlation function is symmetrical for real signals, it follows from (4.6.2) and
(4.6.28) that

$$ (R_{m,n})_{l,i} = \int_{-\infty}^{\infty} S(f)\, e^{-j 2\pi f T (m-n)}\, e^{j 2\pi f (T_i - T_l)}\, e^{j 2\pi f (\tau_l - \tau_i)}\, df \qquad (4.6.29) $$


Define an N-dimensional vector e(f) denoting N phasors at frequency f as

$$ e(f) = \left[ 1,\; e^{-j 2\pi f T},\; \ldots,\; e^{-j 2\pi f T (N-1)} \right]^T \qquad (4.6.30) $$

It follows from (4.6.29) that the N × N matrix denoting the correlation between the lth and
ith elements is given by

$$ (R_{l,i}) = \int S(f)\, e(f)\, e^H(f)\, e^{j 2\pi f (T_i - T_l - \tau_i + \tau_l)}\, df \qquad (4.6.31) $$

Equation (4.6.31) along with (4.6.26) implies that

$$ (R_f(k))_{l,i} = e^H(k)\, (R_{l,i})\, e(k) = \int S(f)\, a(f,k)\, e^{j 2\pi f (\tau_l - T_l - \tau_i + T_i)}\, df \qquad (4.6.32) $$

where

$$ a(f,k) = e^H(k)\, e(f)\, e^H(f)\, e(k) \qquad (4.6.33) $$

Substituting for e(f) and e(k), (4.6.33) becomes

$$ a(f,k) = \frac{\sin^2 \pi \left( k + \frac{f}{f_s} N \right)}{\sin^2 \frac{\pi}{N} \left( k + \frac{f}{f_s} N \right)} \qquad (4.6.34) $$

Using steering vector notation, one obtains from (4.6.32) the following compact expression
for $R_f(k)$:

$$ R_f(k) = \int S(f)\, a(f,k)\, S(f,\phi,\theta)\, S^H(f,\phi,\theta)\, df \qquad (4.6.35) $$

where $S(f,\phi,\theta)$ denotes the steering vector in the $(\phi,\theta)$ direction for an array presteered in $(\phi_0,\theta_0)$.

4.6.2.4 Array with Presteering Delays 

Noting that the steering vector in the $(\phi_0,\theta_0)$ direction for an array presteered in $(\phi_0,\theta_0)$ is
identical to $\mathbf{1}$, it follows from (4.6.33) and (4.6.35) that the matrix $R_f(k)$ due to a source in
a presteered direction is given by

$$ R_f(k) = a(k)\, \mathbf{1} \mathbf{1}^T \qquad (4.6.36) $$

where

$$ a(k) = e^H(k) \left[ \int S(f)\, e(f)\, e^H(f)\, df \right] e(k) \qquad (4.6.37) $$

The matrix in the square brackets on the right side of (4.6.37) is a spectrum-dependent
quantity. Let it be denoted by A. Its (m,n)th element $A_{m,n}$ is given by

$$ A_{m,n} = \int S(f)\, e^{-j 2\pi f (m-n) T}\, df \qquad (4.6.38) $$

$A_{m,n}$ can be evaluated for a specific spectrum using (4.6.38). For example, for a brick-wall
type of spectrum given by


it becomes 



otherwise 


(4.6.39) 


sin27rf H (m-n) sin27rf L (m-n) 
0 27i(m-n) 27t(m-n) 


(4.6.40) 


where f H and f L are assumed to be normalized with respect to the sampling frequency. 


4.6.2.5 Array without Presteering Delays 

For this case, the steering delays $T_l = 0$, l = 1, 2, ..., L. Thus, it follows from (4.6.32) that

$$ R_f(k) = \int S(f)\, a(f,k)\, S(f,\phi,\theta)\, S^H(f,\phi,\theta)\, df \qquad (4.6.41) $$

where $S(f,\phi,\theta)$ denotes the steering vector in the $(\phi,\theta)$ direction for an array without
presteering. Note that this matrix in general is not equal to a matrix that depends on the energy
from the kth bin only, namely

$$ R_f(k) = \int_{k\Delta f}^{(k+1)\Delta f} S(f)\, S(f,\phi,\theta)\, S^H(f,\phi,\theta)\, df \qquad (4.6.42) $$

with Δf = 1/N denoting the bandwidth of a frequency bin.

4.6.2.6 Discussion and Comments 

The results presented here show that when a broadband correlation matrix is transformed 
into narrowband matrices, these matrices depend on the spectrum of the signal beyond 
the bandwidth of their particular frequency bins, which is controlled by the parameter 
a(f,k) given by (4.6.34). Figure 4.9 and Figure 4.10 show how this parameter behaves as a 
function of the frequency for N = 10 and N = 100, respectively. The plots are for k = 0,
and show the normalized value of the parameter with respect to its maximum value $N^2$.
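
The behaviour of a(f,k) can be reproduced with a short script. This sketch assumes frequency is normalized by the sampling frequency (so T = 1), builds the two phasor vectors directly, and compares the resulting quadratic form with the ratio of squared sines; the function names are hypothetical.

```python
import numpy as np

def a_direct(f, k, N):
    """a(f,k) as the squared magnitude of e^H(k) e(f), with f in units of f_s."""
    i = np.arange(N)
    e_k = np.exp(1j * 2 * np.pi * i * k / N)     # phasors of the kth DFT bin
    e_f = np.exp(-1j * 2 * np.pi * f * i)        # phasors at frequency f (T = 1)
    return np.abs(np.vdot(e_k, e_f)) ** 2        # vdot conjugates its first argument

def a_closed(f, k, N):
    """Ratio-of-squared-sines form; undefined where the denominator vanishes."""
    u = k + f * N
    return np.sin(np.pi * u) ** 2 / np.sin(np.pi * u / N) ** 2

N, k = 10, 0
freqs = np.linspace(-0.45, 0.45, 10)             # grid chosen to avoid f = 0
direct = np.array([a_direct(f, k, N) for f in freqs])
closed = np.array([a_closed(f, k, N) for f in freqs])
peak = a_direct(0.0, k, N)                       # value at the centre of bin k = 0
```

Dividing `direct` by `peak` gives the normalized curve, and evaluating it on a dense grid shows how much energy from outside the bin bandwidth leaks into bin k.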


4.6.3 Transformation of Constraints 

As discussed in Section 4.3, the weights of the broadband element space processor using 
TDL are subjected to various constraints to make the processor robust against various 
uncertainties. In this section, some of these constraints are transformed for narrowband 
processors operating in the frequency domain. 
FIGURE 4.9

The parameter a(f,k) defined by (4.6.33), normalized with respect to its maximum value, vs. frequency for N =
10 and k = 0. (From Godara, L.C., Application of the fast Fourier transform to broadband beamforming, J. Acoust.
Soc. Am., 98, 230-240, 1995. With permission.)


4.6.3.1 Point Constraints 

Assume that the weights of the TDL are constrained in the look direction, such that

$$ \sum_{l=1}^{L} w_{lm} = f_m, \quad m = 1, 2, \ldots, N \qquad (4.6.43) $$

where $f_m$, m = 1, 2, ..., N specifies the frequency response of the processor in the look
direction as discussed in Section 4.1.2. Note that (4.6.43) is obtained by rewriting (4.1.24)
with J replaced by N.

Summing on both sides of (4.6.16) over l,

$$ \sum_{l=1}^{L} w_{lm} = \frac{1}{N} \sum_{k=0}^{N-1} \sum_{l=1}^{L} h_l^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k}, \quad m = 1, 2, \ldots, N \qquad (4.6.44) $$

This along with (4.6.43) implies that

$$ f_m = \frac{1}{N} \sum_{k=0}^{N-1} \sum_{l=1}^{L} h_l^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k}, \quad m = 1, 2, \ldots, N \qquad (4.6.45) $$
FIGURE 4.10

The parameter a(f,k) defined by (4.6.33), normalized with respect to its maximum value, vs. frequency for N =
100 and k = 0. (From Godara, L.C., Application of the fast Fourier transform to broadband beamforming, J. Acoust.
Soc. Am., 98, 230-240, 1995. With permission.)

Taking the inverse DFT on both sides, after rearrangements

$$ \sum_{l=1}^{L} h_l^*(k) = \sum_{m=1}^{N} f_m\, e^{j\frac{2\pi}{N}(m-1)k}, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.46) $$

Thus, the equivalent constraints on the weights of the kth bin processor are given by

$$ h^H(k)\, \mathbf{1} = \tilde{f}_k, \quad k = 0, 1, 2, \ldots, N-1 \qquad (4.6.47) $$

where $\tilde{f}_k$ specifies the constraint on the weights of the kth bin processor. It follows from
(4.6.46) that

$$ \tilde{f}_k = \sum_{m=1}^{N} f_m\, e^{j\frac{2\pi}{N}(m-1)k} \qquad (4.6.48) $$

Thus, $\tilde{f}_k$, k = 0, 1, 2, ..., N − 1 are the coefficients of the inverse DFT of $N f_m$, m = 1, 2, ..., N.
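
A small numerical check of the transformed point constraint is sketched below. The TDL weights are random placeholders, the look direction response `f` is simply whatever those weights produce when summed over the elements, and the narrowband weights are obtained by inverting (4.6.16) with NumPy's FFT routines.

```python
import numpy as np

rng = np.random.default_rng(6)
L, N = 4, 8
w = rng.standard_normal((L, N))                  # TDL weights w_lm, array index m-1 for tap m
f = w.sum(axis=0)                                # look direction constraint values f_m

h = np.conj(N * np.fft.ifft(w, axis=1))          # narrowband weights from inverting (4.6.16)
lhs = np.conj(h).sum(axis=0)                     # h^H(k) 1 for every bin k
f_tilde = N * np.fft.ifft(f)                     # inverse DFT of N f_m

# Flat look direction response: a single unit entry at tap k0 gives unit-magnitude bin constraints.
k0 = (N + 1) // 2
f_flat = np.zeros(N)
f_flat[k0 - 1] = 1.0
f_flat_tilde = N * np.fft.ifft(f_flat)
```

For the flat-response case each bin constraint is a pure phase term, so every narrowband processor has unit gain in the look direction and the bins differ only by the delay associated with tap $k_0$.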

4.6.3.2 Derivative Constraints 

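The derivative constraints for the broadband processor are discussed in detail in Section
4.3. These are imposed alongside the point constraints to broaden the beamwidth, which
helps to overcome the pointing errors. First-order constraints are given by (4.3.18) and
(4.3.19). Rewriting,

$$ \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, w_m = 0, \quad m = 1, 2, \ldots, N \qquad (4.6.49) $$

and

$$ \mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\, w_m = 0, \quad m = 1, 2, \ldots, N \qquad (4.6.50) $$

where $\Lambda_\phi(\phi,\theta)$ and $\Lambda_\theta(\phi,\theta)$ are diagonal matrices given by

$$ \Lambda_\phi(\phi,\theta) = \text{diag}\left( \frac{\partial \tau_1(\phi,\theta)}{\partial \phi}, \ldots, \frac{\partial \tau_L(\phi,\theta)}{\partial \phi} \right) \qquad (4.6.51) $$

and

$$ \Lambda_\theta(\phi,\theta) = \text{diag}\left( \frac{\partial \tau_1(\phi,\theta)}{\partial \theta}, \ldots, \frac{\partial \tau_L(\phi,\theta)}{\partial \theta} \right) \qquad (4.6.52) $$

Rewrite (4.6.16) in vector notation as

$$ w_m = \frac{1}{N} \sum_{k=0}^{N-1} h^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k}, \quad m = 1, 2, \ldots, N \qquad (4.6.53) $$

Substituting in (4.6.49) and (4.6.50),

$$ \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, h^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k} = 0, \quad m = 1, 2, \ldots, N \qquad (4.6.54) $$

and

$$ \frac{1}{N} \sum_{k=0}^{N-1} \mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\, h^*(k)\, e^{-j\frac{2\pi}{N}(m-1)k} = 0, \quad m = 1, 2, \ldots, N \qquad (4.6.55) $$

Taking the inverse DFT on both sides of (4.6.54) and (4.6.55), the following equivalent
constraints on the narrowband weights result:

$$ \mathbf{1}^T \Lambda_\phi(\phi_0,\theta_0)\, h^*(k) = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.56) $$

and

$$ \mathbf{1}^T \Lambda_\theta(\phi_0,\theta_0)\, h^*(k) = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.57) $$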

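Alternatively, these may be expressed as

$$ h^H(k)\, \Lambda_\phi(\phi_0,\theta_0)\, \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.58) $$

and

$$ h^H(k)\, \Lambda_\theta(\phi_0,\theta_0)\, \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.59) $$

Following a similar procedure, the second-order derivative constraints for the weights
of the broadband processor given by (4.3.20) to (4.3.25) can be transformed for the weights
of the narrowband processors. These are given by

$$ h^H(k) \left.\frac{\partial \Lambda_\phi(\phi,\theta)}{\partial \phi}\right|_{(\phi_0,\theta_0)} \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.60) $$

$$ h^H(k)\, \Lambda_\phi^2(\phi_0,\theta_0)\, \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.61) $$

$$ h^H(k) \left.\frac{\partial \Lambda_\theta(\phi,\theta)}{\partial \theta}\right|_{(\phi_0,\theta_0)} \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.62) $$

$$ h^H(k)\, \Lambda_\theta^2(\phi_0,\theta_0)\, \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.63) $$

$$ h^H(k) \left.\frac{\partial \Lambda_\phi(\phi,\theta)}{\partial \theta}\right|_{(\phi_0,\theta_0)} \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.64) $$

and

$$ h^H(k)\, \Lambda_\phi(\phi_0,\theta_0)\, \Lambda_\theta(\phi_0,\theta_0)\, \mathbf{1} = 0, \quad k = 0, 1, \ldots, N-1 \qquad (4.6.65) $$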


4.7 Broadband Processing Using Discrete Fourier Transform Method 

In the previous section, an FDM to process broadband signals was discussed in which
broadband time domain data are transformed into narrowband frequency domain data
using the DFT and are then processed using narrowband processing schemes. The processed
signals are transformed back into broadband time domain signals using the inverse DFT.
Thus, the implementation is done using narrowband processors operating at different
frequency bins.

In contrast, in the time domain method (TDM) discussed in Section 4.1.3, the processor
is implemented in the time domain using a TDL structure, as shown in Figure 4.1. The
weights of the broadband processor are obtained by solving the constrained beamforming
problem when the look direction information is available.

In this section, the DFT method for estimating the weights of the broadband processor 
using a TDL structure of Figure 4.1 is discussed, and the performance of the broadband 
processor using the DFT method is compared with that using the time domain method 
[God99]. The method is discussed by considering the beamforming problem with the point 
constraint. In this case the TDM solves the following beamforming problem: 

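$$\begin{aligned} &\underset{\mathbf{W}}{\text{minimize}} && \mathbf{W}^T \mathbf{R}\,\mathbf{W} \\ &\text{subject to} && \mathbf{C}^T \mathbf{W} = \mathbf{f} \end{aligned} \tag{4.7.1}$$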

where C is the constraint matrix defined in (4.1.26) and f is a J-dimensional vector selected 
to specify the frequency response in the look direction. The weights W estimated by the 
TDM are the solution of (4.7.1), and are given by 

$$\mathbf{W} = \mathbf{R}^{-1}\mathbf{C}\left(\mathbf{C}^T \mathbf{R}^{-1} \mathbf{C}\right)^{-1} \mathbf{f} \tag{4.7.2}$$
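
As a small numerical illustration, the standard closed form $\mathbf{W} = \mathbf{R}^{-1}\mathbf{C}(\mathbf{C}^T\mathbf{R}^{-1}\mathbf{C})^{-1}\mathbf{f}$ of the point-constrained problem can be evaluated directly. The sketch below is not taken from the text: the helper names (`solve`, `matmul`, `tdm_weights`), the 4 x 4 matrix `R`, and the block structure of `C` (a presteered array with L = 2 elements and J = 2 taps, each column of `C` summing the weights on one tap) are assumptions chosen only to keep the example self-contained.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n list of lists, b is an n x p list of lists."""
    n = len(a)
    m = [list(ra) + list(rb) for ra, rb in zip(a, b)]   # augmented matrix [a | b]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def quad_form(w, r):
    """Mean output power w^T R w for a real weight vector w."""
    return sum(w[i] * r[i][j] * w[j] for i in range(len(w)) for j in range(len(w)))


def tdm_weights(r, c, f):
    """Constrained weights W = R^-1 C (C^T R^-1 C)^-1 f."""
    rinv_c = solve(r, c)                         # R^-1 C
    gram = matmul(transpose(c), rinv_c)          # C^T R^-1 C
    coeff = solve(gram, [[v] for v in f])        # (C^T R^-1 C)^-1 f
    return [row[0] for row in matmul(rinv_c, coeff)]


# Hypothetical example: L = 2 elements, J = 2 taps, weights stacked tap by tap.
R = [[2.0, 0.5, 0.2, 0.1],
     [0.5, 2.0, 0.5, 0.2],
     [0.2, 0.5, 2.0, 0.5],
     [0.1, 0.2, 0.5, 2.0]]
C = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0],
     [0.0, 1.0]]
f = [1.0, 0.0]

W = tdm_weights(R, C, f)
constraint = [sum(C[i][j] * W[i] for i in range(4)) for j in range(2)]   # C^T W
```

The list `constraint` should reproduce `f` to within rounding, and the output power `quad_form(W, R)` should not exceed that of any other weight vector meeting the same constraint, such as the uniform choice `[0.5, 0.5, 0.0, 0.0]`.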

The DFT method estimates the weights of the broadband processor of Figure 4.1 in two 
steps. First, it estimates the weights of narrowband processors by minimizing the mean 
output power of each frequency bin, and then uses the relations developed in the last 
section between the time domain and frequency domain structures for identical outputs 
to transform these into the required weights. It also maintains the same frequency response 
in the look direction as is done by the TDM using the appropriate constraints developed 
in the last section. Figure 4.11 shows a schematic diagram of the DFT method. 



FIGURE 4.11

Schematic diagram of broadband processor using DFT methods.

The similarity between the TDM and DFT methods is that both estimate the weights of
the TDL structure of Figure 4.1. The main difference between the two is that the DFT method
minimizes the mean output power of each frequency bin, rather than minimizing the mean
output power of the processor, as is done by the TDM. This implies that if the sum of the
mean output powers from all frequency bins is not equal to the mean output power of
the processor shown in Figure 4.1, then the realized processor using the DFT method does
not maximize the mean output SNR in the absence of errors, as is the case with the processor
using the TDM to estimate the weights. However, the DFT method offers the potential for a
large amount of computational savings for real-time applications due to its parallel nature
of implementation, as discussed later in this section.

As the DFT method minimizes the mean output power of each frequency bin and then
uses the relations between the time domain and the frequency domain structures for
identical outputs, the performance of the realized processor in the absence of implementation
errors is the same as that of the processor implemented in the frequency domain. However,
there are important differences.

The main difference between the DFT method and the FDM is that the DFT method uses
the optimized weights of the narrowband processors operating at different frequency bins
to estimate the optimal weights of the time domain broadband processor. The processor
is implemented in the time domain, and the received signal flows through the time domain
structure without encountering the delay associated with the frequency domain
implementation. This may be important for some applications.

Because the performance of the broadband processor implemented in the time domain with
weights estimated by the DFT method is identical to that of the processor implemented in
the frequency domain, the DFT method provides a framework for comparing the performance
of time domain and frequency domain implementations under identical conditions.


4.7.1 Weight Estimation 

The DFT method uses the following procedure to estimate the weights of the time-domain 
broadband-constrained processor using a TDL structure of length J. 

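1. Estimate narrowband array correlation matrices $\mathbf{R}_f(k)$, $k = 0, \ldots, J-1$ using

$$\mathbf{R}_f(k)_{l,i} = \mathbf{e}^H(k)\,(\mathbf{R}_{li})\,\mathbf{e}(k), \quad l, i = 1, \ldots, L \tag{4.7.3}$$

   where

$$\mathbf{e}(k) = \left[1,\; e^{j\frac{2\pi}{J}k},\; \ldots,\; e^{j\frac{2\pi}{J}k(J-1)}\right]^T \tag{4.7.4}$$

   and $(\mathbf{R}_{li})$ is a J x J matrix denoting the correlation between samples from the lth and
   ith elements given by (4.6.27).

2. Estimate $\mathbf{h}(k)$, $k = 0, \ldots, J-1$ using

$$\mathbf{h}(k) = \frac{f_k^*\,\mathbf{R}_f^{-1}(k)\,\mathbf{1}}{\mathbf{1}^T \mathbf{R}_f^{-1}(k)\,\mathbf{1}} \tag{4.7.5}$$

   which are the solutions of the following narrowband beamforming problems:

$$\begin{aligned} &\underset{\mathbf{h}(k)}{\text{minimize}} && \mathbf{h}^H(k)\,\mathbf{R}_f(k)\,\mathbf{h}(k) \\ &\text{subject to} && \mathbf{h}^H(k)\,\mathbf{1} = f_k \end{aligned} \qquad k = 0, \ldots, J-1 \tag{4.7.6}$$

   where

$$f_k = \sum_{m=1}^{J} f_m\, e^{j\frac{2\pi}{J}(m-1)k}, \quad k = 0, \ldots, J-1 \tag{4.7.7}$$

   Equation (4.7.7) ensures that the required frequency response in the desired direction
   is maintained. It should be noted that due to the symmetry property of the
   Fourier transform, one only needs to estimate $\tilde{J}$ narrowband weights $\mathbf{h}(k)$, $k = 0,
   \ldots, \tilde{J}-1$, where

$$\tilde{J} = \begin{cases} \dfrac{J+1}{2}, & \text{when } J \text{ is odd} \\[2mm] \dfrac{J}{2} + 1, & \text{when } J \text{ is even} \end{cases} \tag{4.7.8}$$

3. Estimate the weights of the time domain structure of Figure 4.1 using

$$w_{ml} = \frac{1}{J} \sum_{k=0}^{J-1} h_l^*(k)\, e^{-j\frac{2\pi}{J}(m-1)k}, \quad m = 1, \ldots, J, \quad l = 1, \ldots, L \tag{4.7.9}$$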


The block diagram shown in Figure 4.12 summarizes the method to estimate the weights 
of the broadband processor using the proposed technique. 
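
The three steps can be sketched in a few lines of code. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the phasor column has entries $e^{j2\pi kn/J}$, that the constraint values are $f_k = \sum_m f_m e^{j2\pi(m-1)k/J}$, and that the time domain weights follow from $w_{ml} = \frac{1}{J}\sum_k h_l^*(k)e^{-j2\pi(m-1)k/J}$. The function names, the synthetic correlation blocks, and the values of L, J, and the powers are hypothetical. For clarity all J bins are processed rather than exploiting the Fourier symmetry.

```python
import cmath


def solve(a, b):
    """Solve the complex linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


def dft_method_weights(blocks, f):
    """Return w[m][l], the weight on tap m of element l (both 0-based).

    blocks[l][i] is the J x J correlation matrix (R_li); f holds f_1..f_J."""
    L, J = len(blocks), len(f)
    alpha = 2 * cmath.pi / J
    h = []
    for k in range(J):
        e = [cmath.exp(1j * alpha * k * n) for n in range(J)]
        # Step 1: R_f(k)[l][i] = e^H(k) (R_li) e(k)
        r_f = [[sum(e[a].conjugate() * blocks[l][i][a][b] * e[b]
                    for a in range(J) for b in range(J))
                for i in range(L)] for l in range(L)]
        # Look-direction response required at bin k (constraint value f_k)
        f_k = sum(f[m] * cmath.exp(1j * alpha * m * k) for m in range(J))
        # Step 2: h(k) = conj(f_k) R_f^-1(k) 1 / (1^T R_f^-1(k) 1)
        u = solve(r_f, [1.0] * L)
        denom = sum(u)
        h.append([f_k.conjugate() * u_l / denom for u_l in u])
    # Step 3: w_ml = (1/J) sum_k conj(h_l(k)) exp(-j alpha (m-1) k)
    return [[sum(h[k][l].conjugate() * cmath.exp(-1j * alpha * m * k)
                 for k in range(J)) / J
             for l in range(L)] for m in range(J)]


# Hypothetical scenario: broadside white signal, white noise, and a white
# interference that reaches each successive element one sample later.
L, J = 3, 5
P_S, P_I, SIGMA2 = 1.0, 10.0, 0.1
blocks = [[[[P_S * (m == n) + SIGMA2 * (l == i and m == n) + P_I * (m + l == n + i)
             for n in range(J)] for m in range(J)]
           for i in range(L)] for l in range(L)]
f = [1.0 if m == (J - 1) // 2 else 0.0 for m in range(J)]   # all-pass, centre tap

w = dft_method_weights(blocks, f)
tap_sums = [sum(row) for row in w]          # 1^T w_m, should reproduce f_m
```

Whatever correlation blocks are supplied, the sums in `tap_sums` should equal the entries of `f`, which is the point constraint carried over from the frequency bins to the TDL weights. With real, mutually transposed blocks the resulting weights are real up to rounding.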


4.7.2 Performance Comparison 

In this section, examples are presented to compare the output SNR of the processor using
the weights estimated by the DFT method and the TDM. The weights for the TDM are
computed using (4.7.2), whereas for the DFT method, they are computed using (4.7.3) to
(4.7.9). Both methods use the actual LJ x LJ array correlation matrix R and
produce the LJ weights of the TDL structure.

A linear array of equispaced elements is used in the presence of one interference source.
The element spacing is measured in wavelengths of the desired signal at the highest
frequency. The signal bandwidth is expressed in terms of the normalized frequency with
respect to the sampling frequency. The sampling frequency is taken to be equal to twice the
highest frequency of the desired signal. Thus, the normalized highest frequency of the
desired signal is equal to 0.5. All directional sources are assumed to have a brick-wall
type spectrum. The directional sources considered for the study are assumed to be
of two bandwidths, referred to as the large bandwidth and the small bandwidth. The
normalized frequency band for the large bandwidth is [0.15, 0.5], whereas for the
small bandwidth it is [0.45, 0.5]. The desired signal of unit power is assumed to be present
broadside to the array.

FIGURE 4.12

Block diagram of DFT method. 

The output SNR is computed using

$$\mathrm{SNR} = \frac{\mathbf{W}^H \mathbf{R}_S \mathbf{W}}{\mathbf{W}^H \mathbf{R}_N \mathbf{W}}$$

with $\mathbf{R}_S$ denoting the actual array correlation matrix due to signal only, and $\mathbf{R}_N$ denoting
the actual array correlation matrix due to interference and background noise only. The
SNR is plotted as a function of the angle of the interference by varying it from 0° to 180°.
The array is constrained to have the all-pass response in the desired signal direction by
selecting

$$f_i = \begin{cases} 1, & i = \dfrac{J+1}{2} \\[2mm] 0, & \text{otherwise} \end{cases} \tag{4.7.10}$$


where the filter-length parameter J is assumed to be an odd integer. 
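
A quick check shows why this choice gives an all-pass response: a single unit tap at the centre of the filter transforms to bin values of unit magnitude, with only a linear phase (a pure delay). The sketch below assumes the bin values are obtained from $f_k = \sum_m f_m e^{j2\pi(m-1)k/J}$; the sign convention does not affect the magnitudes. The function name and the value J = 7 are illustrative.

```python
import cmath


def look_direction_response(f):
    """Bin values f_k = sum_m f_m exp(j 2 pi (m-1) k / J) for taps f_1..f_J."""
    J = len(f)
    return [sum(f[m] * cmath.exp(2j * cmath.pi * m * k / J) for m in range(J))
            for k in range(J)]


J = 7                                   # odd filter length (illustrative value)
centre = (J + 1) // 2                   # 1-based index of the only nonzero entry
f = [1.0 if m + 1 == centre else 0.0 for m in range(J)]

f_k = look_direction_response(f)
magnitudes = [abs(v) for v in f_k]      # all-pass: every bin has unit magnitude
```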

The performance comparison is carried out by varying the length of the filter, number 
of elements in the array, the signal bandwidth, and interference-to-background-noise ratio 
to see how various parameters affect the result. 


4.7.2.1 Effect of Filter Length 

In order to compare the performance of the two methods for different numbers of taps,
a five-element array is used in the presence of a directional interference of power 10 dB
above the signal level and white noise of power 10 dB below the signal level. Figure 4.13
shows $\mathrm{SNR}_T$(dB) - $\mathrm{SNR}_D$(dB), or equivalently, $10\log_{10}(\mathrm{SNR}_T/\mathrm{SNR}_D)$, as a function of
interference angle for various filter lengths, with $\mathrm{SNR}_T$ and $\mathrm{SNR}_D$, respectively, denoting
the output SNR of the processor using the TDM and DFT methods.

FIGURE 4.13

10 log($\mathrm{SNR}_T/\mathrm{SNR}_D$) vs. interference angle with number of elements = 5, signal power = 1.0, interference power =
10.0, and white noise power = 0.1. (a) Large bandwidth. (b) Small bandwidth. (From Godara, L.C. and Jahromi,
M.R.S., IEEE Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)


The two bullets on each curve indicate the beamwidth of the antenna array used. An
expression for the beamwidth of the main lobe for the narrowband array with a large
number of elements is given by [Col85]

$$\text{Beamwidth} = 2\arcsin\left(\frac{\lambda_0}{L d}\right)$$

where $\lambda_0$ is the wavelength of the narrowband signal, and d is the element spacing in
meters. Beamwidth for the broadband arrays has been taken to be the average of the two
beamwidths computed at the lowest and the highest frequencies of the signal.

Figure 4.13 shows that the difference between the two SNRs is smaller when the
interference is outside the main lobe than when it is within the main lobe, except when
the interference is close to the look direction, in which case the processor generally is not
used for its interference canceling capability and thus the situation is of no practical
significance.

Above certain values of filter length, the results for the two bandwidths are different. 
For a small bandwidth signal, the difference between the two SNRs is very small when 
the interference is outside the main lobe, whereas it is reasonably high when it is within 
the main lobe, except when the interference is close to the look direction. 

In the case of large bandwidth signals, the difference between the two output SNRs 
does not become as small as that for the small bandwidth case when the interference is 
outside the main lobe. The difference is more sensitive to the filter length above certain 
values for the large bandwidth case compared to the small bandwidth case and decreases 
as the filter length is increased. 


4.7.2.2 Effect of Number of Elements in Array 

For this example, the number of array elements is varied to study its effect on the
performance difference of the two methods. Figure 4.14 shows the difference in the two
SNRs for both bandwidths. When the interference is outside the main lobe, an
increase in the number of elements causes a decrease in the difference between the SNRs
obtained by the two methods. This implies that increasing the number of elements
improves the output SNR of the DFT method more than that of the TDM. Thus, when the
interference is away from the look direction, the output SNR achievable by the DFT method
approaches that of the TDM as the number of elements is increased. It should be noted
that an increase in the number of elements in the array causes a decrease in the array
beamwidth. Thus, as the number of elements in the array increases, the sector outside the
main lobe increases. When an interference is present in this sector, the difference in the
two SNRs is small.

FIGURE 4.14

10 log($\mathrm{SNR}_T/\mathrm{SNR}_D$) vs. interference angle with J = 15, signal power = 1.0, interference power = 10.0, and white
noise power = 0.1. (a) Large bandwidth. (b) Small bandwidth. (From Godara, L.C. and Jahromi, M.R.S., IEEE
Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)

Figure 4.14 also shows that the maximum value of the difference between the SNRs of
the two methods increases as the number of elements in the array is increased. Furthermore,
the direction of interference where the maximum difference between the SNRs
occurs moves closer to the look direction as the number of elements is increased. This
means that as the number of elements is increased, the interference canceling capability
of the DFT method decreases relative to the TDM when the interference is close to the
look direction.

This is a very interesting result. It says that the interference-canceling capability of the
DFT method decreases when the interference is very close to the look direction. In practice,
a situation in which interference is close to the look direction rarely occurs, and even if it 
did, the interference-canceling capability of a processor is low for all practical purposes. 
However, in the presence of the look direction error, situations do occur when the desired 
source is not in the look direction and a processor treats it as interference. Extra precautions 
are necessary to overcome such situations. It appears from these results that the DFT 
method provides this beam-broadening capability naturally. This aspect of the DFT method 
is further explored in a later section to show that it is robust against look direction errors. 

4.7.2.3 Effect of Interference Power 

Figure 4.15 shows the difference in SNRs achievable by the TDM and DFT methods for 
various interference power levels at a given background noise. This figure shows that the 
performance of the processor using the DFT method deteriorates relative to the one using 
the TDM as interference power increases. This is true for small as well as large bandwidth 
signals. However, the deterioration is comparatively low when the interference is outside 
the main lobe. For the small bandwidth case, it is hardly noticeable. 




FIGURE 4.15 

10 log($\mathrm{SNR}_T/\mathrm{SNR}_D$) vs. interference angle with number of elements = 10, J = 15, signal power = 1.0, and white
noise power = 0.01. (a) Large bandwidth. (b) Small bandwidth. (From Godara, L.C. and Jahromi, M.R.S., IEEE
Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)


4.7.3 Computational Requirement Comparison 

In this section, examples are presented to compare the two methods based on their
computational requirements to estimate weights of the TDL filter once the time domain
array correlation matrix has been computed. The computation count reflects the
floating-point operations required for weight estimation. Denoting the computation count for the
TDM and the DFT method by $O_T$ and $O_D$, respectively, one obtains from (4.7.2) and (4.7.3)
to (4.7.9) that

$$O_T = 2J^3\left(L^3 + L^2 + 2L + 1\right) + 2LJ^2 \tag{4.7.11}$$

and

$$O_D = 4J^3L^2 + 4J^2L^2 + 4JL^3 + 8J^2L + 4JL^2 + 10JL \tag{4.7.12}$$

It should be noted that no allowance has been made in either of the methods for any 
special matrix structure that might be used to reduce computation count. 

Figure 4.16 shows the ratio of the floating-point operations for the TDM to those for the DFT
method, $O_T/O_D$, as a function of the filter length for a varying number of elements. The
TDM requires more computation than the DFT method, and a reduction of the order of
50 is possible using an array of 100 elements with a tapped delay line filter of length 100.
It should be noted that an increase in filter length does not increase the computational
savings as much as that achievable by increasing the number of elements. This is also
evident from approximations of $O_T$ and $O_D$ for large J and L. Approximating (4.7.11) and
(4.7.12) for large J and L leads to


$$O_T \approx 2J^3L^3 \tag{4.7.13}$$

and

$$O_D \approx 4J^3L^2 \tag{4.7.14}$$

FIGURE 4.16

Ratio of the required floating point operations using the time domain method to DFT method ($O_T/O_D$) vs.
number of taps (J). (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans. Signal Process., 47, 2386-2395, 1999.
©IEEE. With permission.)

It follows from (4.7.13) and (4.7.14) that

$$\frac{O_T}{O_D} \approx \frac{L}{2} \tag{4.7.15}$$

Thus, a reduction of the order of L/2 is possible using the DFT method.
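
The counts are easy to evaluate numerically. The sketch below codes the two operation counts as read from (4.7.11) and (4.7.12), $O_T = 2J^3(L^3 + L^2 + 2L + 1) + 2LJ^2$ and $O_D = 4J^3L^2 + 4J^2L^2 + 4JL^3 + 8J^2L + 4JL^2 + 10JL$, and compares the exact ratio with the large-size estimate L/2. The function and variable names are illustrative.

```python
def flops_tdm(J, L):
    """Floating-point operations for the time domain method, per (4.7.11)."""
    return 2 * J**3 * (L**3 + L**2 + 2 * L + 1) + 2 * L * J**2


def flops_dft(J, L):
    """Floating-point operations for the DFT method, per (4.7.12)."""
    return (4 * J**3 * L**2 + 4 * J**2 * L**2 + 4 * J * L**3
            + 8 * J**2 * L + 4 * J * L**2 + 10 * J * L)


# 100 elements with a 100-tap filter: exact ratio versus the estimate L / 2.
ratio_100 = flops_tdm(100, 100) / flops_dft(100, 100)
estimate_100 = 100 / 2

# Sensitivity of the savings to filter length and to number of elements.
ratio_more_taps = {J: flops_tdm(J, 10) / flops_dft(J, 10) for J in (10, 100)}
ratio_more_elements = {L: flops_tdm(10, L) / flops_dft(10, L) for L in (10, 100)}
```

For J = L = 100 the exact ratio is just under the estimate of 50. Going from 10 to 100 elements at fixed filter length raises the ratio far more than going from 10 to 100 taps at fixed array size, in line with the L/2 behaviour.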

4.7.4 Schemes to Reduce Computation 

In this section, a number of schemes are discussed to reduce the computational
requirements for weight estimation using the DFT method.

4.7.4.1 Limited Number of Bins Processing 

The DFT method basically divides the entire spectrum into a number of frequency bins 
and processes signals in each bin. The weights at each bin are selected by minimizing the 
mean output power of each bin subject to constraints. In practice, the processing of all 
bins is not necessary, as the desired signal only covers a part of the spectrum, and thus 
one is only interested in canceling the interference that overlaps the signal bandwidth. 
Hence, one only needs to select weights by minimizing the mean output power of those 
bins that are in the vicinity of the signal bandwidth. The weights for bins outside this 
range may be selected to provide the maximum SNR under no directional sources. The 
conventional processor maximizes the output SNR in the absence of a directional source 
environment. Thus, selecting the weights of the antenna array is done by solving the 
optimal beamforming problem for those bins in the vicinity of the signal bandwidth and 
using equal weighting for other bins. 

Since the equal weighting process does not require any computation, processing a
limited number of bins reduces the computation load substantially, depending on the
signal bandwidth. Computer analyses have shown that good results are obtained by
processing two extra bins, one on each side of the signal bandwidth. Let $\hat{J}$ denote the
number of bins in the vicinity of the signal bandwidth that need to be processed by solving
the optimization problem. Then $\hat{J}$ is given by

$$\hat{J} = \begin{cases} \lceil (J+1) B_s \rceil + 2, & J \text{ is odd} \\ \lceil J B_s \rceil + 2, & J \text{ is even} \end{cases} \tag{4.7.16}$$

where $\lceil x \rceil$ denotes the smallest integer greater than or equal to x, and $B_s$ denotes the
normalized signal bandwidth, that is,

$$B_s = (f_H - f_L)/f_s \tag{4.7.17}$$

FIGURE 4.17

SNR improvement using the bin elimination method compared to the DFT method vs. interference angle with
number of elements = 5, J = 101, signal power = 1.0, interference power = 10.0, and white noise power = 0.1.
(From Godara, L.C. and Jahromi, M.R.S., IEEE Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)

In order to illustrate the computational efficiency and performance improvement provided
by this scheme, consider the parameters of Figure 4.17. For this case, the bin elimination
scheme requires the processing of eight bins for the small bandwidth signal and
38 bins for the large bandwidth case, compared with 51 bins by the normal DFT method.
The floating-point operations required to process these bins reduce to 16% and 75% of the
normal DFT method for the two cases, respectively.
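
These bin counts can be reproduced with a short calculation. The sketch below assumes that the limited-bin count is the ceiling of (J + 1) times the normalized bandwidth plus two for odd J (and of J times the bandwidth plus two for even J), with the bandwidth normalized by the sampling frequency, and that the normal DFT method processes (J + 1)/2 bins for odd J; the even-length count J/2 + 1 is the usual number of distinct bins for real data and is an assumption here. Band edges are passed as decimal strings so that exact rational arithmetic avoids rounding problems in the ceiling; the function names are illustrative.

```python
from fractions import Fraction
from math import ceil


def bins_full(J):
    """Bins processed by the normal DFT method for a J-tap filter."""
    return (J + 1) // 2 if J % 2 else J // 2 + 1   # even case: assumed J/2 + 1


def bins_limited(J, f_low, f_high, f_s="1"):
    """Bins processed by the limited-bin scheme, including two extra bins."""
    b_s = (Fraction(f_high) - Fraction(f_low)) / Fraction(f_s)
    span = (J + 1) * b_s if J % 2 else J * b_s
    return ceil(span) + 2


J = 101
full = bins_full(J)
small = bins_limited(J, "0.45", "0.5")      # small bandwidth [0.45, 0.5]
large = bins_limited(J, "0.15", "0.5")      # large bandwidth [0.15, 0.5]
share_small = round(100 * small / full)     # percent of bins still processed
share_large = round(100 * large / full)
```

With J = 101 this gives 51, 8, and 38 bins, and the processed share of bins matches the quoted 16% and 75%.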

Figure 4.17 shows the improvement in output SNR using this method compared with 
the normal DFT method, which processes all the bins. The SNR improvement is evident 
for all interference directions. Thus, the processor using this method not only requires less 
computation time but also attains higher output SNR compared to the normal DFT 
method. 

4.7.4.2 Parallel Processing Schemes 

It is possible to increase the computation speed of the FDM by carrying out many
computations in parallel. Hardware complexity and, thus, system cost are expected to increase
as more and more parallel processing is carried out to increase processing speed. Thus,
there is a tradeoff between speed, which is vital in real-time operations, and system
hardware cost. In this section, selected schemes are discussed, and their computational
requirements are compared with the TDM.

4.7.4.2.1 Parallel Processing Scheme 1

A block diagram showing the steps involved in this scheme to estimate the weights is
shown in Figure 4.18. The scheme processes all frequency bins in parallel. The number of
bins $\tilde{J}$ required to be processed for a J-tap filter is given by (4.7.8).



FIGURE 4.18 

Block diagram of Parallel Processing Scheme Number 1. (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans. 
Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.) 

It should be noted that when weights are estimated for real-time operations, the time
taken by the processor is an important measure of its performance, and the parallel
processing scheme minimizes this time. Let $O_{D1}$ denote the computation count that reflects
this fact, and which represents the time taken to estimate the weights rather than the
total computation requirement. Then, the number of floating-point operations
$O_{D1}$ required to estimate the weights is given by

$$O_{D1} = 8J^2L^2 + 8JL^2 + 8L^3 + 8J^2L + 8L^2 + 20L \tag{4.7.18}$$

4.7.4.2.2 Parallel Processing Scheme 2 

This scheme not only processes all frequency bins in parallel but carries out matrix 
multiplications in parallel. Computation of each element of matrix R f (k) requires the 
following operation: 

$$\mathbf{R}_f(k)_{l,i} = \mathbf{e}^H(k)\,(\mathbf{R}_{li})\,\mathbf{e}(k), \quad l, i = 1, \ldots, L \tag{4.7.19}$$

The scheme carries out multiplication of $\mathbf{e}(k)$ with each column of $(\mathbf{R}_{li})$ in parallel to reduce
computation time from J vector multiplications to one vector multiplication. The resulting
vector is then multiplied with $\mathbf{e}^H(k)$. The total time to compute each element of $\mathbf{R}_f(k)$
reduces from J + 1 complex vector multiplications to two complex vector multiplications.
A block diagram of the scheme is shown in Figure 4.19.

Let the number of floating-point operations required to estimate weights with this
scheme be denoted by $O_{D2}$. Then

$$O_{D2} = 16L^2J + 8L^3 + 8J^2L + 8L^2 + 20L \tag{4.7.20}$$

FIGURE 4.19

Block diagram of Parallel Processing Scheme Number 2. (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans.
Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)

4.7.4.2.3 Parallel Processing Scheme 3 

The FDM requires estimation of $L^2$ elements of matrix $\mathbf{R}_f(k)$. This scheme estimates these
elements in parallel, as shown in Figure 4.20. Thus, by computing $\mathbf{R}_f(k)_{l,i}$, $l, i = 1, \ldots, L$ in
parallel, it saves time of the order of $L^2$ in the matrix estimation. Let the floating-point
operations required to estimate weights using this scheme be denoted by $O_{D3}$. Then

$$O_{D3} = 8J^2 + 8J + 8L^3 + 8J^2L + 8L^2 + 20L \tag{4.7.21}$$

It should be noted that this scheme incorporates the processing of all frequency bins in
parallel but does not carry out the multiplications of $\mathbf{e}(k)$ with $(\mathbf{R}_{li})$ in parallel, as suggested
by Scheme 2. However, it is possible to carry out these operations in parallel by combining
all of the above schemes to get the maximum speed for real-time operations. The
floating-point operations required to estimate the weights using the combined scheme are given
by the following expression:

$$O_{DC} = 16J + 8L^3 + 8J^2L + 8L^2 + 20L \tag{4.7.22}$$

Figure 4.21 compares the ratios of floating-point operations required to estimate the
optimal weights using the TDM to the FDM using various parallel processing schemes.
Figure 4.21(a) shows the results for an array with 100 elements as a function of filter length.

Figure 4.21(b) shows the floating-point operations ratio as a function of the number of
elements using 100 taps. The successive parallel processing schemes require less processing
time, and thus, a substantial increase in computation speed is possible using them. Using
a 100-element array with a filter length of 100 taps, a 50-fold computational savings is
possible without any parallel processing, and 125,620-fold using all parallel processing
schemes, compared to the TDM.

FIGURE 4.20

Block diagram of Parallel Processing Scheme Number 3. (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans.
Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)

FIGURE 4.21

Ratio of the required floating point operations using the time domain method to DFT method ($O_T/O_D$) (a) vs.
number of taps for 100 elements, (b) vs. number of elements for 100 taps. Curve A: Parallel Processing Scheme
Number 1; Curve B: Parallel Processing Scheme Number 2; Curve C: Parallel Processing Scheme Number 3;
Curve D: combination of all parallel processing schemes. (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans.
Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)
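
The speed-up figures can be checked by evaluating the operation counts directly. The sketch below codes the TDM count and the counts for the three parallel schemes and their combination as they are read from (4.7.11) and (4.7.18) to (4.7.22); the function names are illustrative, and the counts for the parallel schemes measure elapsed operations on the critical path rather than total work.

```python
def flops_tdm(J, L):
    """Time domain method, per (4.7.11)."""
    return 2 * J**3 * (L**3 + L**2 + 2 * L + 1) + 2 * L * J**2


def _common(J, L):
    """Terms shared by every parallel scheme."""
    return 8 * L**3 + 8 * J**2 * L + 8 * L**2 + 20 * L


def flops_scheme1(J, L):
    """All frequency bins processed in parallel, per (4.7.18)."""
    return 8 * J**2 * L**2 + 8 * J * L**2 + _common(J, L)


def flops_scheme2(J, L):
    """Bins and matrix-vector products in parallel, per (4.7.20)."""
    return 16 * L**2 * J + _common(J, L)


def flops_scheme3(J, L):
    """Bins and the L^2 matrix elements in parallel, per (4.7.21)."""
    return 8 * J**2 + 8 * J + _common(J, L)


def flops_combined(J, L):
    """All schemes combined, per (4.7.22)."""
    return 16 * J + _common(J, L)


J, L = 100, 100
speedups = {
    "scheme 1": flops_tdm(J, L) / flops_scheme1(J, L),
    "scheme 2": flops_tdm(J, L) / flops_scheme2(J, L),
    "scheme 3": flops_tdm(J, L) / flops_scheme3(J, L),
    "combined": flops_tdm(J, L) / flops_combined(J, L),
}
```

For 100 elements and 100 taps the combined ratio evaluates to roughly 125,619, consistent with the quoted 125,620-fold figure, and each successive scheme gives a larger ratio than the one before it.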

It should be noted that the schemes discussed in this section to increase processing speed 
tend to do so by increasing system complexity. The limited bin-processing scheme not 
only reduces the computation requirements of the DFT method but also has a potential 
to improve its performance without increasing system complexity, as is the case with 
parallel processing schemes.

FIGURE 4.22

Output SNR vs. interference angle with number of taps = 17, signal power = 1.0, interference power = 10.0, and
white noise power = 0.1. Solid line depicts the result using the time domain method with 10 elements and dotted
line depicts the result using the DFT method with 20 elements. (a) Large bandwidth. (b) Small bandwidth. (From
Godara, L.C. and Jahromi, M.R.S., IEEE Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)


4.7.5 Discussion 

It follows from the results presented so far that the DFT method is computationally more 
efficient than the TDM, the output SNR of the beamformer is lower when the weights are 
estimated by the DFT method compared to the case when the weights are estimated by 
the TDM, and the interference-canceling capability of the DFT method decreases more 
than the TDM when the interference approaches the look direction. Some of these issues 
are reexamined in this section with the view to show that by appropriate choice of filter 
length and number of elements in the array, it is possible to achieve a higher output SNR 
with less processing time using the DFT method than the TDM. The DFT method is also 
robust against look direction errors. 

4.7.5.1 Higher SNR with Less Processing Time 

It is possible to obtain better SNR using the DFT method by increasing the number of 
elements or filter length such that the required processing time remains less than when 
using the TDM. Two examples are presented to demonstrate this fact. 

In the first example, an array with 20 elements uses the DFT method and an array with 
10 elements uses the TDM to estimate the weights. Performance of the two methods is 
compared in Figure 4.22, where results are displayed for both small and large bandwidth 
cases. The figure shows that the DFT method yields better performance than the TDM. 
For this case, computational savings of 14% are possible without using any parallel 
processing, and when using combined parallel processing, 99%. 

The second example uses a five-element array and a filter of 17 taps for the TDM and 
177 taps for the DFT method. Results for both bandwidth sources displayed in Figure 4.23 
indicate that the DFT method performance is almost equal to that of the TDM. The DFT 
method for this case requires about 21% less computation time than the TDM. It should 
be noted that the computational savings have been achieved by using parallel processing, 
which increases hardware cost. Reduction in hardware cost could be achieved by using 
the bin elimination method, which reduces 89 parallel stages to 11 stages for the small 
bandwidth case and to 65 parallel stages for the large bandwidth case.

FIGURE 4.23

Output SNR vs. interference angle with number of elements = 5, signal power = 1.0, interference power = 10.0,
and white noise power = 0.1. Solid line depicts the result using the time domain method with 17 taps and dotted
line depicts the result using the DFT method with 177 taps. (a) Large bandwidth. (b) Small bandwidth. (From
Godara, L.C. and Jahromi, M.R.S., IEEE Trans. Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.)



FIGURE 4.24 

Output SNR vs. the look direction error with number of elements = 10, number of taps = 15, bandwidth [0.15,0.5], 
signal power = 1.0, interference power = 100.0, interference direction = 75 degrees with the line of the array, and 
white noise power = 0.001. Solid line depicts the result using the time domain method with 17 taps and dotted 
line depicts the result using the DFT method with 177 taps. (From Godara, L.C. and Jahromi, M.R.S., IEEE Trans. 
Signal Process., 47, 2386-2395, 1999. ©IEEE. With permission.) 


4.7.5.2 Robustness of DFT Method 

Processor performance is compared when weights are estimated using the two methods 
in the presence of the look direction error (LDE). It is assumed that the actual signal 
direction is different from the look direction. The weights in both cases are constrained 
in the look direction. 

Figure 4.24 shows the output SNR of the processor using the two methods as a function 
of look direction error. The error is measured relative to the look direction and is assumed 
positive in the counterclockwise direction. Thus, errors 1° and -1° mean that the signal 
direction is, respectively, 91° and 89° relative to the line of the array.

Figure 4.24 shows that the DFT method is robust against the look direction error in the
presence of a single interference. The SNR of the processor using the DFT method is about
10 dB higher than that of the one using the TDM in the presence of a look direction error
of less than 0.5°. Although computer simulation shows the robustness of the DFT method
against pointing error, a theoretical explanation does not seem to exist.


4.8 Performance 

Performance of broadband arrays as a function of various parameters such
as the number of taps, tap spacing, array geometry, array aperture, and signal bandwidth
has been considered in the literature [May81, Voo92, Com88, Ko81, Ko87, Nun83, Yeh87,
Sco83] to understand their influence on the behavior of arrays. An analysis [May81] of
broadband arrays using eigenvalues of the array correlation matrix indicates that the
product of the array aperture and fractional bandwidth (FBW) of the signal is an important
parameter of the broadband array in determining its performance. The FBW is defined as
the ratio of the bandwidth to the center frequency of the signal. The number of taps
required on each element depends on this parameter as well as on the shape of the array,
with more taps needed for a complex shape. A study [Voo92, Com88] of the SNR as a
function of inter-tap spacing indicates that there is a range of spacing that yields close to
maximum attainable SNR and depends on the FBW of the signal. This range includes
quarter wavelength spacing at the center frequency $f_0$. The quarter wavelength spacing
produces a 90° phase shift at $f_0$ and is equal to $1/(4f_0)$. By measuring the tap spacing as a
multiple of this delay, the inter-tap spacing with the multiple around 1/FBW yields close
to the highest attainable SNR. With the multiple between 1/FBW and 4/FBW, a larger
number of taps for an equivalent performance is necessary.
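
These quantities are simple to compute. The sketch below evaluates the FBW, the quarter-wave delay at the center frequency, and a tap delay equal to 1/FBW quarter-wave delays. It assumes the center frequency is the midpoint of the band, which the text does not state explicitly; the function names and the example band are illustrative.

```python
def fractional_bandwidth(f_low, f_high):
    """FBW = bandwidth / centre frequency (centre taken as the band midpoint)."""
    return (f_high - f_low) / ((f_low + f_high) / 2)


def quarter_wave_delay(f_low, f_high):
    """Delay 1/(4 f0) that gives a 90 degree phase shift at the centre frequency."""
    f0 = (f_low + f_high) / 2
    return 1 / (4 * f0)


def suggested_tap_delay(f_low, f_high):
    """Tap delay equal to (1/FBW) quarter-wave delays, the multiple reported to
    give close to the highest attainable SNR."""
    return quarter_wave_delay(f_low, f_high) / fractional_bandwidth(f_low, f_high)


# Illustrative band: frequencies 0.15 to 0.5 in units of the sampling frequency.
fbw = fractional_bandwidth(0.15, 0.5)
delay = suggested_tap_delay(0.15, 0.5)
```

Under the midpoint assumption the center frequency cancels, so the suggested delay reduces to one quarter of the reciprocal of the absolute bandwidth.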

A study of the jamming rejection capability [Ko81] and of the tracking performance of the 
array in a nonstationary environment [Ko87] also indicates that when tap spacing is measured 
in terms of the signal's center frequency, the best performance is achieved when the 
spacing is $1/(4f_0)$. For this tap spacing, the array correlation matrix has a smaller 
eigenvalue spread, which is the reason for this performance. The eigenvalue spread of a matrix 
indicates the range of values that its eigenvalues take. A larger ratio of the largest 
eigenvalue to the smallest eigenvalue indicates a larger spread. 
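
As an illustration of the eigenvalue spread defined above, the ratio can be computed in closed form for a 2 × 2 Hermitian matrix. This is a sketch under the assumption that the matrix is positive definite; the function name is hypothetical.

```python
import math


def eigenvalue_spread_2x2(a, b, d):
    """Spread (largest/smallest eigenvalue) of [[a, b], [conj(b), d]].

    a and d are real diagonal entries, b is the (possibly complex)
    off-diagonal entry. The matrix is assumed positive definite.
    """
    mean = 0.5 * (a + d)
    radius = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    return (mean + radius) / (mean - radius)
```

A matrix with equal diagonal entries 2 and off-diagonal entry 1 has eigenvalues 3 and 1, so its spread is 3; the identity matrix has a spread of 1.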

The TDL filter tends to increase the degrees of freedom of the array that may be traded 
against the number of elements such that an array with L elements is able to suppress 
more than L - 1 directional interferences provided that their center frequencies are not the 
same and fall within the FBW of the signal [Yeh87]. 
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Notation and Abbreviations 


$(\phi,\theta)$  direction in three-dimensional coordinate system, Figure 2.2 
$\theta$  direction with respect to line array 
$(\phi_0,\theta_0)$  look direction 
$\Delta f$  bandwidth of frequency bin 
$[f_L, f_H]$  frequency band of interest 
$a(f,k)$  scalar defined in (4.6.33) 
DFT  discrete Fourier transform 
FDM  frequency domain method 
FIR  finite impulse response 
LDE  look direction error 
MMSE  minimum mean square error 
MSE  mean square error 
TDL  tapped delay line 
TDM  time domain method 
$A(f,\phi,\theta)$  desired frequency response in direction $(\phi,\theta)$ 
$B$  matrix prefilter 
$B$  matrix with B as diagonal elements 
$B_s$  normalized signal bandwidth 
$C$  LJ × J dimensional constraint matrix 
$C_k$  constraint matrix 
$D$  constraint matrix 
diag[x]  matrix with x as diagonal elements 
$E(t)$  column of matrix prefilter outputs across TDL structure 
$e(t)$  column of L − 1 outputs of matrix prefilter 
$e(k)$  column of N phasors at kth bin defined in (4.6.25) 
$e(f)$  column of N phasors at frequency f defined in (4.6.30) 
$F$  optimal weights with point constraints, only white noise present 
$f$  J-dimensional constraint vector 
$f_k$  kth component of f 
$f_k$  kth coefficient of inverse DFT of $N f_m$, m = 1, 2, ..., N 
$f_s$  sampling frequency 
$G$  optimal weights with directional constraints, only white noise present 
$g$  constraint vector 
$H$, $H(f,\phi,\theta)$  frequency response of TDL processor in direction $(\phi,\theta)$ 
$h(k)$  L weights of narrowband processor for kth bin 
$h_l(k)$  weight on lth channel for kth bin 
$J$  number of taps in tapped delay line filter 
$J$  number of bins that need processing in bin elimination method 
$J$  number of bins that need processing due to DFT properties 
$J(W)$  cost function 
$J(V)$  cost function 
$L$  number of elements 
$M$  number of directional sources 
$M_0$  number of data sets of N samples 
$N$  number of samples processed by frequency domain method 
$O_T$  number of floating-point operations using TDM 
$O_D$  number of floating-point operations using DFT method 
$P$  projection operator 
$P(W)$  mean output power of a processor for given W 
$P_S(W)$  mean output signal power for given W 
$P_N(W)$  mean output noise power for given weight 
$P(k)$  mean output power of narrowband processor for kth bin 
$P$  mean output power of TDL processor using optimal weights 
$P$  LJ-dimensional column vector defined by (4.1.68) 
$Q$  LJ × LJ matrix defined by (4.1.67) 
$R$  array correlation matrix 
$R_S$  array correlation matrix due to signal source 
$R_N$  array correlation matrix due to noise 
$R_l$  array correlation matrix due to lth source in direction $(\phi_l,\theta_l)$ 
$(R)_{m,n}$  matrix denoting correlation after (m − 1) and (n − 1) delays 
$(R)_{l,i}$  matrix denoting correlation between lth and ith elements 
$R_f(k)$  array correlation matrix in frequency domain for kth bin 
$R_f(k)$  array correlation matrix using energy from kth bin only 
$R_{XE}$  matrix of correlation between X(t) and E(t) 
$R_{EE}$  matrix of correlation between E(t) and E(t) 
$r_d$  correlation between desired signal and array signal vector 
$S(f)$  power spectral density of s(t) 
$S(f,\phi,\theta)$  steering vector at frequency f in direction $(\phi,\theta)$ 
$S(f,\phi,\theta)$  steering vector in $(\phi,\theta)$ direction for array presteered in $(\phi_0,\theta_0)$ 
$s(t)$  signal induced on reference element 
SNR  signal-to-noise ratio 
SNR(W)  SNR for given W 
$\mathrm{SNR}_T$  SNR using TDM 
$\mathrm{SNR}_D$  SNR using DFT method 
$T$  inter-tap spacing, sampling interval 
$T(f)$  diagonal matrix of steering delays 
$T_l(\phi_0,\theta_0)$  steering delay on lth element 
$T_0$  bulk delay to make $T_l(\phi_0,\theta_0)$ a positive quantity 
$U$  LJ × LJ matrix of the eigenvectors of Q 





U tl , l matrix of eigenvectors associated with i] 0 nonzero eigenvalues of Q 

U; eigenvector associated with ith eigenvalue of Q 

V error vector, column of (L - 1)J weights of TDL structure 

V (L - 1)J dimensional optimal weights of TDL structure 

v k column of L - 1 weights on the kth tap of TDL structure 

W column of LJ weights of TDL structure 

W|, column of LJ fixed weight 

W optimal weights of TDL processor 

Vy, optimal weights of constrained partioned processor 

W weight vector which minimizes e 0 

W(n) weights estimated at the nth iteration 

w m column of L weights on the mth tap of TDL structure 

w lk weight on the kth tap of the 1th channel 

X(t) column of array signals across the TDL structure 

X(n) array signals at nth instant of time 

x(t) column of array signals after presteering delays 

x(k) column of frequency domain array signals for kth bin 

x(k,m) array signals for kth bin from mth data set 

x,(t) output of 1th sensor presteered in (t)>o,0o) 

x u output of 1th sensor before ith tap 

x,j(m) output of 1th sensor before ith tap from mth data set 

x,(m) N outputs of 1th sensor across TDL filter from mth data set 

x, components of the 1th element along x-axis 

x,(k) output of 1th sensor for kth bin 

x,(k,m) output of 1th sensor from mth data set for kth bin 

y(t) output of processor 

V| components of 1th element along y-axis 

y(n) output at nth instant of time 

y(k) output of processor at kth bin 

y(k,m) output of processor at kth bin from mth data set 

y A (t) output of auxiliary beams 

y F (t) output of fixed beam 

z, components of 1th element along z-axis 

A diagonal matrix with elements being eigenvalues of Q 

A^G) diagonal matrix defined by (4.3.5) 

A e ((|) / 0) diagonal matrix defined by (4.3.17) 

A^((])) diagonal matrix defined by (4.3.51) 

A sampling interval 

Ho rank of Q 

8 0 threshold value 

g 0 normalizing constant 
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o-(<t>) 

«IK4>) 

e o 

X 

HQ) 

\ 

\(n) 

m 

pW 

p, p(f,<i>,e) 

Po 

xi(^e) 

h(<l>) 


vector defined by (4.3.55) 
vector defined by (4.3.56) 

MSE between A(f,(|) O/ 0 o ) and E^f^Go) 

Lagrange multiplier 
ith eigenvalue of Q 

J-dimensional vector of undetermined Lagrange multipliers 
Lagrange multipliers at nth iteration 
vector defined by (4.3.54) 
correlation function of s(t) 

power response of TDL processor in direction ((]),0) 
correlation between desired signal and array output 
delay faced by signal from source in ((]),0) on 1th element 
delay faced by signal from source in ((])) on 1th element 
delay parameter 

Interference canceling capabilities of the optimal antenna array processors discussed in 
previous chapters assumed implicitly or explicitly that the desired signals arriving from 
the look direction and the nonlook directional interferences are not correlated. Correlation 
between the desired signal and unwanted interference exists in situations of multipath 
arrivals and deliberate jamming, and affects the performance of antenna array processors 
as discussed in [Wid82, God90, Tak87, Sha85, Han86, Han88, Lut86, Red87, Zol88, Ali92, 
Qia95, Cho87, Tak86, Wil88, Han92]. 

Two directional signals are said to be fully correlated, or coherent, when one is a 
delayed and scaled version of the other. For two sinusoidal signals, this amounts to a 
fixed phase relation between the two. Coherence between two signals normally arises 
from deliberate jamming using so-called smart jammers, whereas multipath signals 
normally result in partial correlation. The study of antenna array systems presented in 
this chapter includes general correlated fields with coherence as a special case. Unless 
otherwise explicitly stated, only two directional sources are assumed to be present, to 
facilitate the derivation of analytical expressions for the performance measures of antenna 
array processors. 

Correlation between the desired signal and an interference limits the applicability of 
various weight estimation schemes. For example, when the weights are estimated by 
minimizing the mean output power subject to the look direction constraint, the processor 
cancels the desired signal while maintaining the constraint. This happens because the 
processor, while minimizing the mean output power, adjusts the phase of the correlated 
interference induced on each antenna such that the power of the sum of the signal and the 
correlated interference is minimized, which cancels the signal. This behavior is consistent 
with the design, which minimizes the output power; the optimal weight design, however, is 
based on the assumption that the signal is not correlated with the interference. 

The correlation $\delta_{xy}(f)$ between two broadband signals $x(t)$ and $y(t)$ is defined 
in terms of their power spectra [Car87]: 

$$\delta_{xy}(f) = \frac{G_{xy}(f)}{\sqrt{G_{xx}(f)\,G_{yy}(f)}} \tag{5.1}$$

with $G_{xy}(f)$ denoting the cross-power spectrum. It is related to the cross-correlation function 

$$\rho_{xy}(\tau) = E[x(t)\,y(t+\tau)] \tag{5.2}$$

by the Fourier transform 

$$G_{xy}(f) = \int_{-\infty}^{\infty} \rho_{xy}(\tau)\, e^{-j2\pi f\tau}\, d\tau \tag{5.3}$$

This chapter shows that the correlation between the desired signal and the unwanted 
interference severely degrades the performance of antenna array systems, and techniques 
are presented to improve their performance by decorrelating the directional sources. Both 
narrowband and broadband arrays are discussed. 


5.1 Correlated Signal Model 

Consider an array of L omnidirectional elements immersed in the far field of two sinusoidal 
sources. One source is a signal source and the second is interference. Let $p_s$ and $p_I$ 
represent the powers of the signal source and the interference, respectively; and let 
$\sigma_n^2$ denote the variance of the random noise component on each element, which has a 
temporally narrowband and spatially white spectrum. 

Let an L-dimensional vector $x(t)$ represent the L waveforms derived from the L elements 
of the array, and let a complex scalar $\delta$, which lies within the unit circle, represent 
the correlation coefficient between the two sources. Taking the center of the coordinate 
system as the time reference, the vector $x(t)$ can be expressed as 

$$x(t) = \sqrt{p_s}\, m_s(t)\, S_0 + \sqrt{p_I}\left(\delta^* m_s(t) + \sqrt{1-|\delta|^2}\; m_I(t)\right) S_I + n(t) \tag{5.1.1}$$

where $S_0$ and $S_I$ are steering vectors in the signal direction and in the interference 
direction, respectively; * denotes the complex conjugate; $n(t)$ represents the random noise 
component; and $m_s(t)$ and $m_I(t)$ are zero-mean, unit-variance, complex low-pass processes 
associated, respectively, with the signal source and the interference source. It is assumed 
that $m_s(t)$, $m_I(t)$, and $n_l(t)$ are mutually uncorrelated, with $n_l(t)$ denoting the 
lth component of $n(t)$. 

The value of the complex scalar $\delta$ decides the correlated field under consideration. 
When $\delta$ lies on the unit circle, that is $|\delta| = 1$, the two sources are coherent 
and their fixed-phase difference is given by $\delta_p$, the phase of $\delta$. On the other 
hand, when it lies inside the unit circle with $|\delta| < 1$, the two sources are partially 
correlated, and $\delta = 0$ corresponds to the uncorrelated field case. For the uncorrelated 
field, (5.1.1) becomes 

$$x(t) = \sqrt{p_s}\, m_s(t)\, S_0 + \sqrt{p_I}\, m_I(t)\, S_I + n(t) \tag{5.1.2}$$

Equation (5.1.2) is identical to (2.1.15) with M = 2, $m_1(t) = \sqrt{p_s}\, m_s(t)$, 
$m_2(t) = \sqrt{p_I}\, m_I(t)$, $S_1 = S_0$, and $S_2 = S_I$. 

From (5.1.1) it follows that the array correlation matrix R can be expressed as 

$$R = E[x(t)\,x^H(t)] = A S A^H + \sigma_n^2 I \tag{5.1.3}$$

where the L × 2 dimensional matrix 

$$A = [S_0,\; S_I] \tag{5.1.4}$$

and the 2 × 2 dimensional source correlation matrix 

$$S = \begin{bmatrix} p_s & \sqrt{p_s p_I}\,\delta \\ \sqrt{p_s p_I}\,\delta^* & p_I \end{bmatrix} \tag{5.1.5}$$

Note that (5.1.3) is identical to (2.1.27) with M = 2. However, the source correlation matrix 
S for the correlated case given by (5.1.5) differs from the uncorrelated case given by (2.1.28) 
due to the presence of off-diagonal terms containing the correlation coefficient. 

The above equations show how the correlation between the two sources affects R. It 
follows from these expressions that when the two sources are uncorrelated, that is 
$\delta = 0$, S is a diagonal matrix, guaranteeing R to be positive definite (assuming A is 
of full rank, which requires that steering vectors corresponding to all directional sources 
are linearly independent [God81]). The presence of correlation affects the rank of S and thus 
of R. In the presence of correlation, the matrix R becomes ill conditioned and may not be 
invertible, making it difficult to estimate the weights of the optimal beamformer, which 
relies on the existence of the inverse of R. Thus, a beamforming scheme that is optimal in 
the absence of correlated arrivals is not able to cancel a correlated interference. 
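
The effect of the correlation coefficient on the rank of S can be checked numerically. The sketch below builds the source correlation matrix of (5.1.5) and evaluates its determinant, which equals $p_s p_I (1 - |\delta|^2)$ and therefore vanishes for coherent sources. The function names and the example powers are illustrative assumptions.

```python
import math


def source_correlation_matrix(p_s, p_i, delta):
    """2 x 2 source correlation matrix S of (5.1.5)."""
    cross = math.sqrt(p_s * p_i) * complex(delta)
    return [[complex(p_s), cross],
            [cross.conjugate(), complex(p_i)]]


def det2(m):
    """Determinant of a 2 x 2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]
```

With $p_s = 1$ and $p_I = 4$, the determinant is 4 for $\delta = 0$, 3 for $|\delta| = 0.5$, and 0 for $|\delta| = 1$, at which point S has rank one.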

In the next section, the behavior of the constrained element space processor (ESP) 
discussed in Section 2.4 is analyzed. 


5.2 Optimal Element Space Processor 

Consider the narrowband ESP shown in Figure 2.1. The output y(t) and the mean output 
power P(w) of the processor for the given weights w are given by 

$$y(t) = w^H x(t) \tag{5.2.1}$$

and 

$$P(w) = w^H R\, w \tag{5.2.2}$$

Let $\hat{w}$ represent the L weights of the processor that minimize the mean output power 
subject to a unity constraint in the look direction, that is, 

$$\begin{aligned} &\underset{w}{\text{minimize}} && w^H R\, w \\ &\text{subject to} && w^H S_0 = 1 \end{aligned} \tag{5.2.3}$$

The processor with these weights is referred to as the optimal processor in Chapter 2. 
An expression for the mean output power of the optimal processor in the presence of 
correlated arrival is derived [Red87] below. Substituting for R from (5.1.3) to (5.1.5) in 
(5.2.2), it follows that 

$$\begin{aligned}
P(w) &= \begin{bmatrix} w^H S_0 & w^H S_I \end{bmatrix}
\begin{bmatrix} p_s & \sqrt{p_s p_I}\,\delta \\ \sqrt{p_s p_I}\,\delta^* & p_I \end{bmatrix}
\begin{bmatrix} S_0^H w \\ S_I^H w \end{bmatrix} + \sigma_n^2\, w^H w \\
&= p_s\, w^H S_0 S_0^H w + p_I\, w^H S_I S_I^H w + \sqrt{p_s p_I}\,\delta\, w^H S_0 S_I^H w \\
&\quad + \sqrt{p_s p_I}\,\delta^*\, w^H S_I S_0^H w + \sigma_n^2\, w^H w
\end{aligned} \tag{5.2.4}$$


To solve the beamforming problem (5.2.3) using the Lagrange multiplier method, define a 
cost function 

$$J(w) = \tfrac{1}{2} P(w) + \lambda\,(w^H S_0 - 1) \tag{5.2.5}$$

where $\lambda$ is the Lagrange multiplier. The solution $\hat{w}$ is obtained by setting the 
partial derivative of the cost function with respect to w equal to zero. Thus, 

$$\frac{\partial J(w)}{\partial w} = 0 \tag{5.2.6}$$

Substituting for J(w) in (5.2.6) and using (5.2.4), 

$$\left(p_s S_0 S_0^H + p_I S_I S_I^H + \sqrt{p_s p_I}\,\delta\, S_0 S_I^H + \sqrt{p_s p_I}\,\delta^* S_I S_0^H + \sigma_n^2 I\right)\hat{w} = -\lambda S_0 \tag{5.2.7}$$

Define 

$$\beta = \frac{S_0^H S_I}{L} \tag{5.2.8}$$

and note that 

$$\hat{w}^H S_0 = 1 \tag{5.2.9}$$

$$S_0^H S_0 = L \tag{5.2.10}$$

and 

$$S_I^H S_I = L \tag{5.2.11}$$

Premultiplying (5.2.7) by $S_0^H$ and using (5.2.8) to (5.2.11), one obtains 

$$\lambda = -p_s - S_I^H \hat{w}\left(p_I \beta + \sqrt{p_s p_I}\,\delta\right) - \sqrt{p_s p_I}\,\delta^* \beta - \frac{\sigma_n^2}{L} \tag{5.2.12}$$

Substituting for $\lambda$ in (5.2.7), premultiplying it with $S_I^H$, using (5.2.8) to 
(5.2.11), and solving for $\hat{w}^H S_I$, one obtains the optimized processor response in the 
interference direction, that is, 

$$\hat{w}^H S_I = \frac{\sigma_n^2 \beta - \sqrt{p_s p_I}\,\delta L \rho}{\sigma_n^2 + p_I L \rho} \tag{5.2.13}$$

where 

$$\rho = 1 - \frac{S_0^H S_I S_I^H S_0}{L^2} \tag{5.2.14}$$

Note that $\rho$ is also given by 

$$\rho = 1 - \beta\beta^* \tag{5.2.15}$$

Equation (5.2.13) describes the response of the beamformer in the interference direction. 
Consider $\sigma_n^2 = 0$. Equation (5.2.13) for this case becomes 

$$\hat{w}^H S_I = -\sqrt{\frac{p_s}{p_I}}\;\delta \tag{5.2.16}$$

It follows from (5.2.16) that the response of the optimal processor in the interference 
direction in the absence of white noise is not zero, as it is for uncorrelated sources. Thus, 
the processor does not cancel the interference when it is correlated with the desired signal. 

The implication of (5.2.16) is that even though the response of the processor in the look 
direction is unity ($\hat{w}^H S_0 = 1$), the processor using weights $\hat{w}$ suppresses the 
look direction signal. From (5.1.1), the part of the output that depends on $m_s(t)$ has 
amplitude $\sqrt{p_s}\,\hat{w}^H S_0 + \sqrt{p_I}\,\delta^* \hat{w}^H S_I = \sqrt{p_s}\,(1 - |\delta|^2)$, 
which vanishes for coherent sources. This aspect is now examined by deriving an expression for 
the mean output power of the optimal processor. 

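It follows from (5.2.4), and the fact that an expression for $\hat{w}^H S_I$ has been derived 
previously, that to evaluate $P(\hat{w})$ only an expression for $\hat{w}^H \hat{w}$ is 
necessary. This can be obtained by premultiplying (5.2.7) with $\hat{w}^H$ and using (5.2.8) 
to (5.2.11). Thus, 

$$\sigma_n^2\, \hat{w}^H \hat{w} = -\lambda - p_s - p_I\, \hat{w}^H S_I S_I^H \hat{w} - \sqrt{p_s p_I}\,\delta\, S_I^H \hat{w} - \sqrt{p_s p_I}\,\delta^*\, \hat{w}^H S_I \tag{5.2.17}$$

where $\lambda$ is given by (5.2.12) and $\hat{w}^H S_I$ is given by (5.2.13). Substituting for 
$\sigma_n^2 \hat{w}^H \hat{w}$ from (5.2.17), $\lambda$ from (5.2.12), and $\hat{w}^H S_I$ from 
(5.2.13) in (5.2.4), 

$$\hat{P} = \hat{w}^H R\, \hat{w} = p_s + \frac{\sigma_n^2}{L} + \frac{-p_s p_I |\delta|^2 L\rho + (1-\rho)\, p_I \sigma_n^2 + \sqrt{p_s p_I}\,(\beta\delta^* + \beta^*\delta)\,\sigma_n^2}{p_I L \rho + \sigma_n^2} \tag{5.2.18}$$

Now consider $\sigma_n^2 = 0$. For this case 

$$\hat{P} = p_s\left(1 - |\delta|^2\right) \tag{5.2.19}$$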


It follows from (5.2.19) that the mean output power of the processor decreases as the 
magnitude of the correlation constant increases and reduces to zero for coherent sources. 
The processor in this case completely cancels the desired signal. 
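
The closed-form results (5.2.13) and (5.2.18) can be compared against a direct numerical solution of (5.2.3), for which the minimizer is $\hat{w} = R^{-1} S_0 / (S_0^H R^{-1} S_0)$. The following is a minimal sketch in plain Python. The half-wavelength linear array, the steering-vector phase convention, and all function names are assumptions made for illustration only.

```python
import cmath
import math


def steering(num_elements, theta_deg):
    """Steering vector of a half-wavelength linear array (assumed convention)."""
    phase = math.pi * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * phase * l) for l in range(num_elements)]


def inner(a, b):
    """Return a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(s0, si, p_s, p_i, delta, noise):
    """R = A S A^H + noise * I, following (5.1.3) to (5.1.5)."""
    c = math.sqrt(p_s * p_i) * complex(delta)
    n = len(s0)
    r = [[p_s * s0[a] * s0[b].conjugate() + p_i * si[a] * si[b].conjugate()
          + c * s0[a] * si[b].conjugate() + c.conjugate() * si[a] * s0[b].conjugate()
          for b in range(n)] for a in range(n)]
    for a in range(n):
        r[a][a] += noise
    return r


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a complex linear system."""
    n = len(rhs)
    m = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0j] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x


def esp_numeric(s0, si, p_s, p_i, delta, noise):
    """Solve (5.2.3) directly; returns (w^H S_I, mean output power)."""
    r_inv_s0 = solve(correlation_matrix(s0, si, p_s, p_i, delta, noise), s0)
    scale = inner(s0, r_inv_s0)  # S0^H R^-1 S0, real and positive for Hermitian R
    w = [v / scale for v in r_inv_s0]
    return inner(w, si), (1.0 / scale).real


def esp_closed_form(s0, si, p_s, p_i, delta, noise):
    """Interference response (5.2.13) and mean output power (5.2.18)."""
    n = len(s0)
    delta = complex(delta)
    beta = inner(s0, si) / n                      # (5.2.8)
    rho = 1.0 - abs(beta) ** 2                    # (5.2.15)
    c = math.sqrt(p_s * p_i)
    denom = noise + p_i * n * rho
    response = (noise * beta - c * delta * n * rho) / denom
    cross = (beta * delta.conjugate() + beta.conjugate() * delta).real
    power = p_s + noise / n + (-p_s * p_i * abs(delta) ** 2 * n * rho
                               + (1.0 - rho) * p_i * noise
                               + c * cross * noise) / denom
    return response, power
```

Calling both functions with the same arguments (nonzero noise, so that R is invertible) should give matching responses and powers to within rounding error.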

In the next section, the optimized postbeamformer interference canceler (PIC) processor is 
studied in a correlated field environment. It is shown that the performance of the optimized 
PIC processor is identical to that of the optimal ESP [God90]. Using this fact, a derivation of 
the output signal-to-noise ratio (SNR) of the optimal ESP is presented in Section 5.4. 


5.3 Optimized Postbeamformer Interference Canceler Processor 

The narrowband PIC processor structure is shown in Figure 5.7. Section 2.6.3 shows that 
the mean output power of the processor for a given weight w is given by 

$$P(w) = V^H R V + w^* w\, U^H R U - w^*\, V^H R U - w\, U^H R V \tag{5.3.1}$$

where the L-dimensional complex vectors V and U, respectively, denote the fixed weights of 
the signal beam and the interference beam, and a complex scalar w denotes the adjustable 
weight. The optimal weight $\hat{w}$, which minimizes the mean output power P(w), is given 
by (2.6.48), that is, 

$$\hat{w} = \frac{V^H R U}{U^H R U} \tag{5.3.2}$$

Assume that the signal beam is formed using the conventional beamforming weights, 
that is, 

$$V = \frac{S_0}{L} \tag{5.3.3}$$

and that the interference beam is selected as follows: 

$$U = P S_I \tag{5.3.4}$$

where P is a projection matrix given by 

$$P = I - \frac{S_0 S_0^H}{L} \tag{5.3.5}$$

Equation (5.3.3) ensures that the signal beam response in the look direction is unity. The 
interference beam selected using (5.3.4) and (5.3.5) has a unity response in the interference 
direction and has a null in the look direction. This form of interference beam has been 
selected to facilitate the derivation of the output SNR for the optimized ESP. 

Next, an expression for $P(\hat{w})$, the mean output power of the optimized PIC, is derived, 
and it is shown that $P(\hat{w})$ is equal to $\hat{P}$, the mean output power of the optimal 
ESP in the presence of correlated sources. It follows from (5.3.1) and (5.3.2) that the mean 
output power $P(\hat{w})$ of the optimal PIC is given by 

$$P(\hat{w}) = V^H R V - \frac{U^H R V\; V^H R U}{U^H R U} \tag{5.3.6}$$


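Substituting for V and U from (5.3.3) and (5.3.4) in (5.3.6) results in an expression for 
$P(\hat{w})$. This is achieved by evaluating $V^H R V$, $V^H R U$, $U^H R V$, and $U^H R U$, 
and substituting in (5.3.6). It follows from (5.3.3), (5.3.4), and (5.3.5) that 

$$V^H U = \frac{S_0^H P S_I}{L} = 0 \tag{5.3.7}$$

and 

$$U^H U = S_I^H P S_I = L\rho \tag{5.3.8}$$

These, along with (5.1.3), imply that 

$$V^H R U = V^H A S A^H U \tag{5.3.9}$$

$$V^H R V = V^H A S A^H V + \frac{\sigma_n^2}{L} \tag{5.3.10}$$

and 

$$U^H R U = U^H A S A^H U + L\rho\,\sigma_n^2 \tag{5.3.11}$$

From (5.3.3), (5.1.4), and (5.2.8), 

$$V^H A = \begin{bmatrix} 1 & \beta \end{bmatrix} \tag{5.3.12}$$

Similarly, 

$$A^H U = \begin{bmatrix} S_0^H P S_I \\ S_I^H P S_I \end{bmatrix} = \begin{bmatrix} 0 \\ L\rho \end{bmatrix} \tag{5.3.13}$$

Thus, 

$$V^H R U = \begin{bmatrix} 1 & \beta \end{bmatrix}
\begin{bmatrix} p_s & \sqrt{p_s p_I}\,\delta \\ \sqrt{p_s p_I}\,\delta^* & p_I \end{bmatrix}
\begin{bmatrix} 0 \\ L\rho \end{bmatrix}
= \sqrt{p_s p_I}\,\delta L\rho + p_I \beta L\rho \tag{5.3.14}$$

$$V^H R V = \begin{bmatrix} 1 & \beta \end{bmatrix}
\begin{bmatrix} p_s & \sqrt{p_s p_I}\,\delta \\ \sqrt{p_s p_I}\,\delta^* & p_I \end{bmatrix}
\begin{bmatrix} 1 \\ \beta^* \end{bmatrix} + \frac{\sigma_n^2}{L}
= p_s + (1-\rho)\,p_I + \sqrt{p_s p_I}\,(\beta\delta^* + \beta^*\delta) + \frac{\sigma_n^2}{L} \tag{5.3.15}$$

and 

$$U^H R U = \begin{bmatrix} 0 & L\rho \end{bmatrix}
\begin{bmatrix} p_s & \sqrt{p_s p_I}\,\delta \\ \sqrt{p_s p_I}\,\delta^* & p_I \end{bmatrix}
\begin{bmatrix} 0 \\ L\rho \end{bmatrix} + L\rho\,\sigma_n^2
= L^2\rho^2 p_I + L\rho\,\sigma_n^2 \tag{5.3.16}$$

It follows from (5.3.14) along with (5.2.15) and (5.3.16) that 

$$\frac{U^H R V\; V^H R U}{U^H R U} = \frac{p_s p_I |\delta|^2 L^2\rho^2 + p_I^2 (1-\rho) L^2\rho^2 + \sqrt{p_s p_I}\; p_I L^2\rho^2\,(\beta\delta^* + \beta^*\delta)}{L^2\rho^2 p_I + L\rho\,\sigma_n^2} \tag{5.3.17}$$

Subtracting (5.3.17) from (5.3.15), the following expression for $P(\hat{w})$ results: 

$$P(\hat{w}) = p_s + \frac{\sigma_n^2}{L} + \frac{-p_s p_I |\delta|^2 L\rho + (1-\rho)\, p_I \sigma_n^2 + \sqrt{p_s p_I}\,(\beta\delta^* + \beta^*\delta)\,\sigma_n^2}{p_I L\rho + \sigma_n^2} \tag{5.3.18}$$

Comparing (5.2.18) and (5.3.18) shows that the mean output powers of the optimal ESP 
and the optimal PIC with V and U selected using (5.3.3) and (5.3.4), respectively, are the 
same. Thus, in the presence of the correlated sources, the two processors perform identically. 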
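
The intermediate quantities (5.3.14) and (5.3.16) and the final power (5.3.18) can also be checked numerically by forming V and U from (5.3.3) to (5.3.5) and evaluating the quadratic forms in (5.3.6) directly. This is a sketch only; the half-wavelength array geometry, the phase convention, and the function names are assumptions introduced for the illustration.

```python
import cmath
import math


def steering(num_elements, theta_deg):
    """Steering vector of a half-wavelength linear array (assumed convention)."""
    phase = math.pi * math.cos(math.radians(theta_deg))
    return [cmath.exp(1j * phase * l) for l in range(num_elements)]


def inner(a, b):
    """Return a^H b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def correlation_matrix(s0, si, p_s, p_i, delta, noise):
    """R = A S A^H + noise * I, following (5.1.3) to (5.1.5)."""
    c = math.sqrt(p_s * p_i) * complex(delta)
    n = len(s0)
    r = [[p_s * s0[a] * s0[b].conjugate() + p_i * si[a] * si[b].conjugate()
          + c * s0[a] * si[b].conjugate() + c.conjugate() * si[a] * s0[b].conjugate()
          for b in range(n)] for a in range(n)]
    for a in range(n):
        r[a][a] += noise
    return r


def quad(a, r, b):
    """Return a^H R b."""
    n = len(a)
    return sum(a[i].conjugate() * r[i][j] * b[j] for i in range(n) for j in range(n))


def pic_numeric(s0, si, p_s, p_i, delta, noise):
    """Return (V^H R U, U^H R U, optimal PIC power) from (5.3.3) to (5.3.6)."""
    n = len(s0)
    r = correlation_matrix(s0, si, p_s, p_i, delta, noise)
    v = [x / n for x in s0]                          # (5.3.3)
    proj = inner(s0, si) / n                         # S0^H S_I / L
    u = [si[k] - s0[k] * proj for k in range(n)]     # U = P S_I, (5.3.4), (5.3.5)
    vru = quad(v, r, u)
    uru = quad(u, r, u).real
    power = quad(v, r, v).real - abs(vru) ** 2 / uru  # (5.3.6)
    return vru, uru, power


def pic_closed_form(s0, si, p_s, p_i, delta, noise):
    """Closed forms (5.3.14), (5.3.16), and (5.3.18)."""
    n = len(s0)
    delta = complex(delta)
    beta = inner(s0, si) / n
    rho = 1.0 - abs(beta) ** 2
    c = math.sqrt(p_s * p_i)
    vru = (c * delta + p_i * beta) * n * rho
    uru = n ** 2 * rho ** 2 * p_i + n * rho * noise
    cross = (beta * delta.conjugate() + beta.conjugate() * delta).real
    power = p_s + noise / n + (-p_s * p_i * abs(delta) ** 2 * n * rho
                               + (1.0 - rho) * p_i * noise
                               + c * cross * noise) / (p_i * n * rho + noise)
    return vru, uru, power
```

Because (5.3.18) has the same form as (5.2.18), agreement between `pic_numeric` and `pic_closed_form` also illustrates the equivalence of the two processors for this choice of V and U.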

In the next section, an expression for the output SNR of the two processors is derived 
[God90]. As the performance of the two processors is the same, the PIC processor is used 
for derivation of results. 


5.4 Signal-to-Noise Ratio Performance 

For ease of analysis, rewrite (5.1.1) by regrouping the terms containing $m_s(t)$ as follows: 

$$x(t) = m_s(t)\left(\sqrt{p_s}\, S_0 + \delta^* \sqrt{p_I}\, S_I\right) + m_I(t)\, \sqrt{p_I}\,\sqrt{1-|\delta|^2}\; S_I + n(t) \tag{5.4.1}$$

From (5.1.3) and (5.4.1), 

$$R = \left[p_s S_0 S_0^H + |\delta|^2 p_I S_I S_I^H + \sqrt{p_s p_I}\left(\delta\, S_0 S_I^H + \delta^* S_I S_0^H\right)\right] + p_I\left(1-|\delta|^2\right) S_I S_I^H + \sigma_n^2 I \tag{5.4.2}$$

It should be noted that the array correlation matrix is composed of three terms. The first 
term, in square brackets, is contributed by the signal source. Let it be denoted by $R_s$. The 
second and the third terms on the RHS of (5.4.2) are contributions due to the interference 
source (the component that is uncorrelated with the signal source) and the random noise. 
Let these be denoted by $R_I$ and $R_n$, respectively. 

It follows from (5.3.1) that the output signal power $P_s(\hat{w})$, the residual interference 
power $P_I(\hat{w})$, and the output uncorrelated noise power $P_n(\hat{w})$ of the optimal PIC 
are, respectively, given by 

$$P_s(\hat{w}) = V^H R_s V + \hat{w}^*\hat{w}\, U^H R_s U - \hat{w}^*\, V^H R_s U - \hat{w}\, U^H R_s V \tag{5.4.3}$$

$$P_I(\hat{w}) = V^H R_I V + \hat{w}^*\hat{w}\, U^H R_I U - \hat{w}^*\, V^H R_I U - \hat{w}\, U^H R_I V \tag{5.4.4}$$

and 

$$P_n(\hat{w}) = V^H R_n V + \hat{w}^*\hat{w}\, U^H R_n U - \hat{w}^*\, V^H R_n U - \hat{w}\, U^H R_n V \tag{5.4.5}$$

Let $P_N(\hat{w})$ denote the total noise at the optimal PIC output. This consists of the output 
uncorrelated noise power and the residual interference power, that is, 

$$P_N(\hat{w}) = P_I(\hat{w}) + P_n(\hat{w}) \tag{5.4.6}$$

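First, consider $P_s(\hat{w})$ and evaluate the various terms on the RHS of (5.4.3). It follows 
from (5.3.2), (5.3.14), and (5.3.16) that 

$$\hat{w} = \frac{\sqrt{p_s p_I}\,\delta + p_I \beta}{L\rho\, p_I + \sigma_n^2} \tag{5.4.7}$$

and 

$$\hat{w}^*\hat{w} = \frac{p_s p_I |\delta|^2 + p_I^2 (1-\rho) + p_I \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*)}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \tag{5.4.8}$$

Substituting for $R_s$ from the first term on the RHS of (5.4.2) and using (5.3.3) to (5.3.5), 

$$V^H R_s V = p_s + |\delta|^2 p_I (1-\rho) + \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta) \tag{5.4.9}$$

$$U^H R_s U = |\delta|^2 p_I L^2 \rho^2 \tag{5.4.10}$$

and 

$$U^H R_s V = |\delta|^2 p_I L\rho\,\beta^* + \sqrt{p_s p_I}\,\delta^* L\rho \tag{5.4.11}$$

It follows from (5.4.7) and (5.4.11) that 

$$\hat{w}\, U^H R_s V = \frac{L\rho\, p_I \left[p_s |\delta|^2 + p_I (1-\rho) |\delta|^2 + \sqrt{p_s p_I}\left(|\delta|^2 \beta^*\delta + \beta\delta^*\right)\right]}{L\rho\, p_I + \sigma_n^2} \tag{5.4.12}$$

Substituting from (5.4.8) to (5.4.12) in (5.4.3), 

$$\begin{aligned}
P_s(\hat{w}) ={}& p_s + |\delta|^2 p_I (1-\rho) + \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta) \\
&+ \frac{p_s p_I^2 |\delta|^4 L^2\rho^2 + p_I^3 (1-\rho) |\delta|^2 L^2\rho^2 + p_I^2 |\delta|^2 L^2\rho^2 \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta)}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \\
&- \frac{2L\rho\, p_I p_s |\delta|^2 + 2L\rho\, p_I^2 (1-\rho) |\delta|^2}{L\rho\, p_I + \sigma_n^2} \\
&- \frac{L\rho\, p_I \sqrt{p_s p_I}\, |\delta|^2 (\delta\beta^* + \delta^*\beta) + L\rho\, p_I \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta)}{L\rho\, p_I + \sigma_n^2} \\
={}& p_s \left[1 + \left(\frac{p_I |\delta|^2 L\rho}{L\rho\, p_I + \sigma_n^2}\right)^2 - \frac{2L\rho\, p_I |\delta|^2}{L\rho\, p_I + \sigma_n^2}\right] \\
&+ p_I (1-\rho) |\delta|^2 \left[1 + \left(\frac{p_I L\rho}{L\rho\, p_I + \sigma_n^2}\right)^2 - \frac{2L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\right] \\
&+ \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta) \left[1 + \frac{p_I^2 |\delta|^2 L^2\rho^2}{\left(L\rho\, p_I + \sigma_n^2\right)^2} - \frac{L\rho\, p_I |\delta|^2 + L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\right]
\end{aligned} \tag{5.4.13}$$

After manipulation, (5.4.13) leads to 

$$\begin{aligned}
P_s(\hat{w}) ={}& p_s \left[1 - \frac{|\delta|^2 L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\right]^2 + p_I (1-\rho) |\delta|^2 \left[\frac{\sigma_n^2}{L\rho\, p_I + \sigma_n^2}\right]^2 \\
&+ \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta)\, \frac{\sigma_n^2}{L\rho\, p_I + \sigma_n^2} \left[1 - \frac{L\rho\, p_I |\delta|^2}{L\rho\, p_I + \sigma_n^2}\right]
\end{aligned} \tag{5.4.14}$$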


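Next, an expression for the residual interference at the output of the optimal PIC is 
derived. Substituting from (5.3.3), (5.3.4), and the second term on the RHS of (5.4.2) for 
$R_I$ in (5.4.4) results in 

$$P_I(\hat{w}) = p_I \left(1 - |\delta|^2\right)\left[1 - \rho + \hat{w}^*\hat{w}\, L^2\rho^2 - \hat{w}^* \beta L\rho - \hat{w}\, \beta^* L\rho\right] \tag{5.4.15}$$

which along with (5.4.7) and (5.4.8) implies that 

$$\begin{aligned}
P_I(\hat{w}) ={}& p_I \left(1 - |\delta|^2\right) \Bigg[(1-\rho) + \frac{p_s p_I |\delta|^2 L^2\rho^2 + p_I^2 (1-\rho) L^2\rho^2}{\left(L\rho\, p_I + \sigma_n^2\right)^2}
+ \frac{p_I \sqrt{p_s p_I}\, L^2\rho^2 (\delta^*\beta + \delta\beta^*)}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \\
&\qquad - \frac{L\rho \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*) + 2L\rho\, p_I (1-\rho)}{L\rho\, p_I + \sigma_n^2}\Bigg] \\
={}& p_I \left(1 - |\delta|^2\right) \Bigg\{(1-\rho)\left[1 + \left(\frac{p_I L\rho}{L\rho\, p_I + \sigma_n^2}\right)^2 - \frac{2L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\right]
+ \frac{p_s p_I |\delta|^2 L^2\rho^2}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \\
&\qquad + \frac{L\rho \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*)}{L\rho\, p_I + \sigma_n^2}\left[\frac{L\rho\, p_I}{L\rho\, p_I + \sigma_n^2} - 1\right]\Bigg\}
\end{aligned} \tag{5.4.16}$$

After manipulation, (5.4.16) leads to 

$$P_I(\hat{w}) = p_I \left(1 - |\delta|^2\right) \frac{\sigma_n^2 \left\{(1-\rho)\,\sigma_n^2 - L\rho \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*)\right\} + p_s p_I |\delta|^2 L^2\rho^2}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \tag{5.4.17}$$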


Similarly, an expression for the uncorrelated noise power at the optimal PIC output is 
given by 

$$P_n(\hat{w}) = \frac{\sigma_n^2}{L} + L\rho\, p_I \sigma_n^2\, \frac{p_s |\delta|^2 + p_I (1-\rho) + \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*)}{\left(L\rho\, p_I + \sigma_n^2\right)^2} \tag{5.4.18}$$

Substituting from (5.4.17) and (5.4.18) in (5.4.6), after manipulation, 

$$\begin{aligned}
P_N(\hat{w}) ={}& \frac{\sigma_n^2}{L} + \frac{p_I \sigma_n^2 (1-\rho)}{L\rho\, p_I + \sigma_n^2} + \frac{|\delta|^2 p_s L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\left[1 - \frac{|\delta|^2 L\rho\, p_I}{L\rho\, p_I + \sigma_n^2}\right] \\
&+ |\delta|^2\, \frac{\sqrt{p_s p_I}\, L\rho\, p_I (\delta^*\beta + \delta\beta^*)\,\sigma_n^2 - p_I (1-\rho)\,\sigma_n^4}{\left(L\rho\, p_I + \sigma_n^2\right)^2}
\end{aligned} \tag{5.4.19}$$

It can easily be verified that for $|\delta| = 0$, (5.4.19) reduces to (2.7.72). 

Let $\mathrm{SNR}(\hat{w})$ denote the output SNR of the optimal PIC defined as 

$$\mathrm{SNR}(\hat{w}) = \frac{P_s(\hat{w})}{P_N(\hat{w})} \tag{5.4.20}$$

Substituting for $P_s(\hat{w})$ and $P_N(\hat{w})$ from (5.4.14) and (5.4.19) in (5.4.20), 

$$\mathrm{SNR}(\hat{w}) = \frac{p_s \left(1 + \gamma - |\delta|^2\right)^2 + p_I (1-\rho) |\delta|^2 \gamma^2 + \sqrt{p_s p_I}\,(\delta\beta^* + \delta^*\beta)\,\gamma \left(1 + \gamma - |\delta|^2\right)}{\dfrac{\sigma_n^2}{L}\left(\dfrac{1}{\rho} + \gamma\right)(1 + \gamma) + |\delta|^2 \left[p_s \left(1 + \gamma - |\delta|^2\right) + \gamma \sqrt{p_s p_I}\,(\delta^*\beta + \delta\beta^*) - p_I (1-\rho)\,\gamma^2\right]} \tag{5.4.21}$$

where 

$$\gamma = \frac{\sigma_n^2}{L\rho\, p_I} \tag{5.4.22}$$

As discussed in the previous section, the optimal PIC and ESP behave identically in the 
absence of errors. Thus, the expression for the output SNR given by (5.4.21) is true for 
both processors. Let it be denoted by SNRO. In the following, some special cases are 
considered. 


5.4.1 Zero Uncorrelated Noise 

Zero uncorrelated noise corresponds to $\sigma_n^2 = 0$. From (5.4.22) it follows that for this 
case, $\gamma = 0$, which along with (5.4.21) implies that 

$$\mathrm{SNRO} = \frac{1 - |\delta|^2}{|\delta|^2} \tag{5.4.23}$$

Thus, in the absence of uncorrelated noise, the output SNR is independent of the array 
geometry and noise environment. It only depends on the magnitude of the correlation 
coefficient and is independent of its phase. It should be noted here that $\gamma = 0$ 
indirectly assumes that $p_I$ and $\rho$ are not identical to zero. 


5.4.2 Strong Interference and Large Number of Elements 

Now, consider a case when there is a strong interference source in the presence of nonzero 
uncorrelated noise, and the array consists of a large number of elements, such that 

$$\gamma \to 0 \tag{5.4.24}$$

It follows from (5.4.21) and (5.4.24) that 

$$\mathrm{SNRO} = \frac{p_s \left(1 - |\delta|^2\right)^2}{\dfrac{\sigma_n^2}{L\rho} + |\delta|^2 p_s \left(1 - |\delta|^2\right)} \tag{5.4.25}$$

Thus, the output SNR in this case is less than that for the zero uncorrelated noise case 
and decreases as the uncorrelated noise power increases. 
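
The general expression (5.4.21) and its two limiting cases can be evaluated with a short function. This is a minimal sketch; the function names are hypothetical, and $\beta$ is passed in directly rather than computed from a particular array geometry.

```python
import math


def output_snr(p_s, p_i, delta, beta, num_elements, noise):
    """Output SNR of (5.4.21); beta = S0^H S_I / L and noise = sigma_n^2."""
    delta, beta = complex(delta), complex(beta)
    rho = 1.0 - abs(beta) ** 2                       # (5.2.15)
    gamma = noise / (num_elements * rho * p_i)       # (5.4.22)
    d2 = abs(delta) ** 2
    cross = (delta * beta.conjugate() + delta.conjugate() * beta).real
    root = math.sqrt(p_s * p_i)
    k = 1.0 + gamma - d2
    numerator = (p_s * k ** 2 + p_i * (1.0 - rho) * d2 * gamma ** 2
                 + root * cross * gamma * k)
    denominator = ((noise / num_elements) * (1.0 / rho + gamma) * (1.0 + gamma)
                   + d2 * (p_s * k + gamma * root * cross
                           - p_i * (1.0 - rho) * gamma ** 2))
    return numerator / denominator


def snr_zero_noise(delta):
    """Zero-noise limit (5.4.23)."""
    d2 = abs(delta) ** 2
    return (1.0 - d2) / d2


def snr_strong_interference(p_s, delta, beta, num_elements, noise):
    """Limit gamma -> 0 with nonzero noise, (5.4.25)."""
    d2 = abs(delta) ** 2
    rho = 1.0 - abs(beta) ** 2
    return p_s * (1.0 - d2) ** 2 / (noise / (num_elements * rho) + d2 * p_s * (1.0 - d2))
```

With `noise=0` and a nonzero correlation coefficient, `output_snr` should reduce to `snr_zero_noise`; with a very large interference power it should approach `snr_strong_interference`.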


5.4.3 Coherent Sources 

This corresponds to $|\delta| = 1$. Substituting $|\delta| = 1$ in (5.4.21), the following 
expression results for the output SNR when the signal source and the interference source are 
fully correlated: 

$$\mathrm{SNRO} = \frac{\gamma^2 \left(p_s + p_I (1-\rho) + \sqrt{p_s p_I}\;\Omega\right)}{\dfrac{\sigma_n^2}{L}\left(\dfrac{1}{\rho} + \gamma\right)(1 + \gamma) + p_s \gamma - p_I (1-\rho)\,\gamma^2 + \gamma \sqrt{p_s p_I}\;\Omega} \tag{5.4.26}$$

where 

$$\Omega = \beta e^{-j\delta_p} + \beta^* e^{j\delta_p} \tag{5.4.27}$$

with $\delta_p$ denoting the phase of the correlation coefficient. 
It follows from (5.2.15) that 

$$\beta = \sqrt{1-\rho}\; e^{j\beta_p} \tag{5.4.28}$$

with $\beta_p$ denoting the phase of $\beta$. Thus, (5.4.27) implies 

$$\Omega = 2\sqrt{1-\rho}\,\cos(\beta_p - \delta_p) \tag{5.4.29}$$

Substituting for $\gamma$ and $\Omega$ in (5.4.26) leads to the following expression for SNRO 
for fully correlated sources: 

$$\mathrm{SNRO} = \frac{A}{\dfrac{\sigma_n^2}{L} + B + C\left(\dfrac{\sigma_n^2}{L}\right)^{-1}} \tag{5.4.30}$$

with 

$$A = p_s + p_I (1-\rho) + 2\sqrt{p_s p_I (1-\rho)}\,\cos(\beta_p - \delta_p) \tag{5.4.31}$$

$$B = 2 p_I \rho \tag{5.4.32}$$

$$C = p_I^2 \rho + p_s p_I \rho + 2 p_I^{3/2} \rho \sqrt{p_s (1-\rho)}\,\cos(\beta_p - \delta_p) \tag{5.4.33}$$

It follows from (5.4.30) that for fully correlated sources, the output SNR (1) increases as 
$\sigma_n^2/L$ increases for low values of $\sigma_n^2/L$, and (2) decreases as $\sigma_n^2/L$ 
increases for high values of $\sigma_n^2/L$. Furthermore, the output SNR attains the maximum 
value 

$$\mathrm{SNRO}_{\max} = \frac{A}{2\sqrt{C} + B} \tag{5.4.34}$$

when 

$$\sigma_n^2 = L\sqrt{C} \tag{5.4.35}$$
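
The behavior of (5.4.30) as a function of $\sigma_n^2/L$, including the peak given by (5.4.34) and (5.4.35), can be illustrated as follows. This is a sketch only; the function names are hypothetical and `phase_diff` stands for $\beta_p - \delta_p$ in radians.

```python
import math


def coherent_terms(p_s, p_i, rho, phase_diff):
    """Constants A, B, C of (5.4.31) to (5.4.33)."""
    root = math.sqrt(p_s * p_i * (1.0 - rho))
    a = p_s + p_i * (1.0 - rho) + 2.0 * root * math.cos(phase_diff)
    b = 2.0 * p_i * rho
    c = (p_i ** 2 * rho + p_s * p_i * rho
         + 2.0 * p_i * rho * root * math.cos(phase_diff))
    return a, b, c


def coherent_snr(p_s, p_i, rho, phase_diff, noise_per_element):
    """Output SNR (5.4.30) for coherent sources; noise_per_element = sigma_n^2 / L."""
    a, b, c = coherent_terms(p_s, p_i, rho, phase_diff)
    return a / (noise_per_element + b + c / noise_per_element)


def coherent_snr_peak(p_s, p_i, rho, phase_diff):
    """Return (sigma_n^2 / L at the peak, peak SNR) from (5.4.35) and (5.4.34)."""
    a, b, c = coherent_terms(p_s, p_i, rho, phase_diff)
    return math.sqrt(c), a / (2.0 * math.sqrt(c) + b)
```

Evaluating `coherent_snr` on either side of the noise level returned by `coherent_snr_peak` gives smaller values, which is the rise-then-fall behavior described above.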


5.4.4 Examples and Discussion 

In this section, some examples are presented to understand the effect of correlation on the 
output SNR. For the results presented in Figure 5.1 to Figure 5.4, a linear array of one- 
half wavelength spacing is used. The signal source of unity power is assumed broadside 
to the array. Interference direction is measured relative to the line of the array. The 
correlation phase is measured at the center of the coordinate system, which is at an end 
element of the array. The correlation phase is assumed to be equal to 45°. 

Figure 5.1 shows the output SNR as a function of the uncorrelated noise power for 
various values of $|\delta|$. The curve with the solid line is for fully correlated sources 
and agrees with the results presented in the previous section. One observes from the figure 
that as $|\delta|$ increases, the output SNR (1) decreases for low values of uncorrelated 
noise, and (2) increases for high values of uncorrelated noise. The reason for the increase in 
the output SNR as $|\delta|$ increases in the presence of high uncorrelated noise power is 
that, for this scenario, the optimal processor tends to behave as the conventional processor. 
The response of the conventional processor in the direction of the interference is fixed, and 
thus the processor does not minimize the output power by canceling the desired signal. 

Figure 5.2 to Figure 5.4 show the output SNR as a function of $|\delta|$ for various values of 
$\sigma_n^2$. Figure 5.2 is for an array with four elements whereas Figure 5.3 is for an array 
with ten elements. Comparing Figure 5.2 and Figure 5.3, it is apparent that for a given noise 
field, an increase in the number of elements in an array causes the output SNR to increase in 
the absence of correlation but has a reverse effect when the sources are fully correlated. 
Note the difference in the scales for the two figures. 

Figure 5.4 shows the results for an array with ten elements when interference is in 
direction 65°. A comparison between Figure 5.3 and Figure 5.4 reveals that the effect of 
correlation on the output SNR is greater when the interference is far from the look direction. 
For the scenario of Figure 5.4, the output SNR decreases as $|\delta|$ increases at all values 
of $\sigma_n^2$. However, the reduction at higher values of $\sigma_n^2$ is much less than at 
lower values of $\sigma_n^2$. 

FIGURE 5.1 

Output SNR vs. the uncorrelated noise power for a four-element linear array with one-half 
wavelength spacing for various values of $|\delta|$, $p_I = 1$, $\theta_I = 85°$, $p_s = 1$, 
$\theta_0 = 90°$, $\delta_p = 45°$. (From Godara, L.C., IEEE Trans. Acoust. Speech 
Signal Process., 38, 1-15, 1990. ©IEEE. With permission.) 


5.5 Methods to Alleviate Correlation Effects 

Many beamforming schemes have been devised to cancel an interference source that is 
correlated with the signal. In principle, these work by restoring the rank of R. In this 
section, some of these are briefly reviewed [God97]. 

In some earlier work [Wid82, Gab80], a mechanical movement of the array perpendicular 
to the look direction was suggested to reduce the signal cancelation effect by the correlated 
interference. The scheme generally known as the spatial dither algorithm works on the 
principle that as the movement is perpendicular to the look direction, the signal induced 
in the array is not affected, whereas the interference that arrives from a direction different 
from that of the signal gets modulated with this motion. This causes a reduction in 
interference, as noted in [Cho87] where the dither algorithm is further developed such 
that a mechanical movement is not required. 

The spatial smoothing scheme [Eva81] uses a notion of spatial averaging by subdividing 
the array into smaller subarrays, and estimates the array correlation matrix by averaging 
the correlation matrices estimated from each such subarray. The use of spatial smoothing 
for beamforming is discussed in [Sha85, Red87] showing that the use of this method 
reduces effective correlation between the interference and desired signal resulting in 
reduced signal cancelation caused by optimal beamforming. Details on spatial smoothing
are provided in Section 5.6.

FIGURE 5.2

Output SNR vs. the magnitude of the correlation coefficient for a four-element linear array with one-half
wavelength spacing for various values of $\sigma_n^2$, $p_I = 100$, $\theta_I = 85^\circ$, $p_S = 1$, $\theta_0 = 90^\circ$, $\delta_p = 45^\circ$. (From Godara, L.C.,
IEEE Trans. Acoust. Speech Signal Process., 38, 1-15, 1990. ©IEEE. With permission.)

The spatial smoothing method uses uniform averaging of all matrices obtained from 
various subarrays, that is, each matrix is weighted equally. This results in an estimate of 
the matrix that is not as good as the one that could have been obtained from given subarray 
matrices. Ideally in the absence of correlation, the array correlation matrix for a uniformly 
spaced linear array has a Toeplitz structure, that is, elements of the matrix along each 
diagonal are equal, and the estimated matrix by the spatial smoothing scheme is not the 
closest to the Toeplitz matrix. An estimated matrix that is closest to a Toeplitz matrix is 
obtained by a spatial averaging technique [Tak87, Lim90]. This technique weights each
subarray matrix differently and then optimizes the weights to minimize the mean
square error between the weighted matrix and a Toeplitz matrix. When this matrix is used
to estimate the weights of the beamformer, the resulting system reduces more interference
than that given by the uniformly weighted matrix estimate.

It should be noted that the number of rows and columns in the estimated matrix is equal 
to the number of elements in the subarray and not equal to the number of elements in 
the full array. Thus, the weights estimated by this matrix could only be applied to one of 
the subarrays. Consequently, not all array elements are used for beamforming. This 
reduces the array aperture and its degrees of freedom. For an environment consisting of
M - 1 directional interferences and the desired signal, the subarray size should be at least
M + 1 and the number of subarrays should be at least M(M - 1) + 1 [Tak87].

A scheme that does not reduce the degrees of freedom of the array is described in
[God90]. It decorrelates the sources by structuring the correlation matrix as the Toeplitz
type by averaging along each diagonal, and uses the resulting matrix to estimate the
weights of the full array. An adaptive algorithm to estimate the weights of an array based
on this principle is presented in [God91], and the concept is extended to broadband
beamforming in [God92]. Details are provided in Section 5.7 and Section 5.8.

FIGURE 5.3

Output SNR vs. the magnitude of the correlation coefficient for a ten-element linear array with one-half
wavelength spacing for various values of $\sigma_n^2$, $p_I = 100$, $\theta_I = 85^\circ$, $p_S = 1$, $\theta_0 = 90^\circ$, $\delta_p = 45^\circ$. (From Godara, L.C., IEEE
Trans. Acoust. Speech Signal Process., 38, 1-15, 1990. ©IEEE. With permission.)

A beamforming scheme [Wid82] based on master and slave concepts cancels the correlated
arrival by the use of two channels. In one channel, the look direction signal is blocked,
and then weights are estimated by solving the constrained beamforming problem. These
weights are then used on the second channel. As the signal is not present at the time of
weight estimation, the beamformer does not cancel the signal. However, the process only
works for one correlated interference. It is extended for the multiple correlated interference
case in [Lut86] where an array of 2M - 1 elements is required to cancel M - 1 interferences.

Other schemes that require some knowledge of the interference, such as direction or the 
correlation matrix due to interference only, are discussed in [Han86, Han88, Qia95, Wil88, 
Han92]. Many of the schemes discussed above improve the array performance in the 
presence of correlated arrivals by treating the correlated components as interferences and 
canceling them by forming nulls in their directions using beamforming techniques. These 
methods do not utilize the correlated components as is done in the diversity-combining 
techniques discussed in Chapter 7. In diversity combining, various components are added 
in a way to improve the signal level. 

The RAKE receiver [Vau88, Tur80, Pri58, Faw64] achieves this increase in signal level 
for a CDMA system by using a number of demodulators operating in parallel to track 
each component employing the user code for that signal. The signal delay is identified by 
sliding the code sequence to obtain the maximum correlation with the received component. 
The signals are added at the baseband after appropriate delay and amplitude scaling. The 
receiver, however, does not cancel unwanted interference by shaping the beam pattern.

FIGURE 5.4

Output SNR vs. the magnitude of the correlation coefficient for a ten-element linear array with one-half
wavelength spacing for various values of $\sigma_n^2$, $p_I = 100$, $\theta_I = 65^\circ$, $p_S = 1$, $\theta_0 = 90^\circ$, $\delta_p = 45^\circ$. (From Godara, L.C., IEEE
Trans. Acoust. Speech Signal Process., 38, 1-15, 1990. ©IEEE. With permission.)

FIGURE 5.5

Construction of subarrays.


5.6 Spatial Smoothing Method 

The spatial smoothing method, also known as the subarray averaging method, estimates 
the weights of an L-element antenna array system using an augmented array of more than 
L elements, and is suitable for a linear array of equispaced elements. The signals induced 
on these extra elements are only used to restore the rank of the array correlation matrix 
to be used in weight estimation. These signals are not used to produce the array output. 

The method divides the array into $L_0$ subarrays of size $L$ such that the first subarray
consists of Element 1 to Element $L$, the second one consists of Element 2 to Element $L + 1$,
and so on as shown in Figure 5.5.

Let the $L$-dimensional vectors $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_{L_0}(t)$ denote the array signal vectors of the $L_0$
subarrays, that is,

$$\mathbf{x}_1(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{5.6.1}$$

$$\mathbf{x}_2(t) = [x_2(t), x_3(t), \ldots, x_{L+1}(t)]^T \tag{5.6.2}$$

$$\mathbf{x}_{L_0}(t) = [x_{L_0}(t), x_{L_0+1}(t), \ldots, x_{L+L_0-1}(t)]^T \tag{5.6.3}$$

where $x_k(t)$ denotes the signal induced on the $k$th element of the augmented array (full
array).

Let $R_k$ denote the array correlation matrix of the $k$th subarray, that is,

$$R_k = E[\mathbf{x}_k(t)\mathbf{x}_k^H(t)] \tag{5.6.4}$$

Define the spatially smoothed correlation matrix $\bar{R}$ by averaging $R_k$, $k = 1, 2, \ldots, L_0$, that is,

$$\bar{R} = \frac{1}{L_0}\sum_{k=1}^{L_0} R_k \tag{5.6.5}$$

and use this to estimate the weights of the array system. It follows from (5.6.3) that to
form $L_0$ subarrays of size $L$, one needs $L + L_0 - 1$ elements.

As shown in [Sha85], the matrix $\bar{R}$ has full rank iff $L_0 \ge L - 1$. Thus, to estimate an $L \times L$
dimensional full-rank spatially smoothed correlation matrix to estimate weights of an
$L$-element array system, at least $2(L - 1)$ elements are necessary.
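
The subarray averaging of (5.6.5) can be sketched numerically. The example below is a minimal illustration under assumed values, not the book's implementation: it takes two fully coherent, unit-power sources at 90° and 65°, half-wavelength spacing, and no noise, and uses the fact that each $R_k$ is an $L \times L$ diagonal block of the full-array correlation matrix. The function names `steering_vector` and `spatially_smoothed` are hypothetical. Without noise, the rank of the signal part rises from one (a single subarray) to two (the number of sources) after smoothing.

```python
import numpy as np

def steering_vector(num_elements, theta_deg, spacing=0.5):
    """Steering vector of an equispaced linear array.
    spacing is in wavelengths; the angle is measured from the line of the array."""
    l = np.arange(num_elements)
    return np.exp(2j * np.pi * spacing * l * np.cos(np.deg2rad(theta_deg)))

def spatially_smoothed(R_full, L):
    """Average the L x L diagonal blocks of the full-array matrix, as in (5.6.5)."""
    L0 = R_full.shape[0] - L + 1              # number of subarrays
    blocks = [R_full[k:k + L, k:k + L] for k in range(L0)]
    return sum(blocks) / L0

# Assumed scenario: two fully coherent, unit-power sources and no noise.
L, L0 = 4, 3
n_total = L + L0 - 1                          # elements needed for L0 subarrays
a = steering_vector(n_total, 90.0) + steering_vector(n_total, 65.0)
R_full = np.outer(a, a.conj())                # rank one because the sources are coherent

R_sub = R_full[:L, :L]                        # first subarray only, no smoothing
R_bar = spatially_smoothed(R_full, L)

rank_before = np.linalg.matrix_rank(R_sub)
rank_after = np.linalg.matrix_rank(R_bar)
print(rank_before, rank_after)
```

With noise of power $\sigma_n^2$ added to every element, the same averaging leaves the $\sigma_n^2 I$ term unchanged, so only the source part of the matrix is affected.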


5.6.1 Decorrelation Analysis 

In this section, an analysis is presented that shows the decorrelation effect of the spatial 
smoothing method [Red87]. It follows from (2.1.15) that the array signal vector x(t) due 
to M directional sources and white noise can be expressed in the matrix notation as 

x(t) = As(t) + n(t) (5.6.6) 

where the $M$-dimensional vector $\mathbf{s}(t)$ is defined as

$$\mathbf{s}(t) = [m_1(t), m_2(t), \ldots, m_M(t)]^T \tag{5.6.7}$$

with $m_k(t)$ denoting the modulating function of the $k$th source and

$$A = [\mathbf{S}(\theta_1), \ldots, \mathbf{S}(\theta_M)] \tag{5.6.8}$$

with $\mathbf{S}(\theta_k)$ denoting the steering vector associated with the $k$th source in direction $\theta_k$, that is,

$$\mathbf{S}(\theta_k) = \left[e^{j2\pi f_0\tau_1(\theta_k)}, \ldots, e^{j2\pi f_0\tau_L(\theta_k)}\right]^T \tag{5.6.9}$$

with

$$\tau_l(\theta_k) = (l - 1)\frac{d}{c}\cos\theta_k \tag{5.6.10}$$

Now, consider the array signal vector for the $k$th subarray. Following (5.6.6) to (5.6.10),
it can be expressed as

$$\mathbf{x}_k(t) = A_k\mathbf{s}(t) + \mathbf{n}_k(t) \tag{5.6.11}$$

where $\mathbf{n}_k(t)$ denotes the random noise vector received by the $k$th subarray,

$$A_k = [\mathbf{S}_k(\theta_1), \mathbf{S}_k(\theta_2), \ldots, \mathbf{S}_k(\theta_M)] \tag{5.6.12}$$

and

$$\mathbf{S}_k(\theta) = \left[e^{j2\pi f_0\tau_1^k(\theta)}, e^{j2\pi f_0\tau_2^k(\theta)}, \ldots, e^{j2\pi f_0\tau_L^k(\theta)}\right]^T \tag{5.6.13}$$

with $\tau_l^k(\theta)$ denoting the propagation delay from the origin to the $l$th element in the $k$th
subarray. As the $k$th subarray comprises elements $k$ to $k + (L - 1)$, it follows that

$$\tau_1^k(\theta) = (k - 1)\frac{d}{c}\cos\theta \tag{5.6.14}$$

$$\tau_2^k(\theta) = k\,\frac{d}{c}\cos\theta \tag{5.6.15}$$

and

$$\tau_L^k(\theta) = (k + L - 2)\frac{d}{c}\cos\theta \tag{5.6.16}$$

Thus, (5.6.13) becomes

$$\mathbf{S}_k(\theta) = \left[1, e^{j2\pi\bar{d}\cos\theta}, \ldots, e^{j2\pi\bar{d}(L-1)\cos\theta}\right]^T e^{j2\pi\bar{d}(k-1)\cos\theta} \tag{5.6.17}$$

where $\bar{d} = d/\lambda$ and $\lambda$ denotes the wavelength corresponding to $f_0$.
Substituting (5.6.17) in (5.6.12) and using (5.6.8),

$$A_k = A\Phi^{k-1} \tag{5.6.18}$$

where $\Phi$ is an $M \times M$ diagonal matrix with

$$\Phi_{m,m} = e^{j2\pi\bar{d}\cos\theta_m}, \quad m = 1, 2, \ldots, M \tag{5.6.19}$$

It follows from (5.6.11) and (5.6.18) that

$$\mathbf{x}_k(t) = A\Phi^{k-1}\mathbf{s}(t) + \mathbf{n}_k(t) \tag{5.6.20}$$

Equations (5.6.4) and (5.6.20) imply that

$$R_k = A\Phi^{k-1}S\left(\Phi^{k-1}\right)^H A^H + \sigma_n^2 I \tag{5.6.21}$$

where $S$ is the source covariance matrix defined as

$$S = E[\mathbf{s}(t)\mathbf{s}^H(t)] \tag{5.6.22}$$

For uncorrelated sources, $S$ is given by (2.1.28). For correlated sources, $S_{ij}$ denotes the
correlation between the $i$th and $j$th sources. For the correlated source model presented in
Section 5.1, $S_{ij}$ is given by (5.1.5).

The following expression for the spatially smoothed correlation matrix results after
substituting for $R_k$ from (5.6.21) in (5.6.5):

$$\bar{R} = A\bar{S}A^H + \sigma_n^2 I \tag{5.6.23}$$

where

$$\bar{S} = \frac{1}{L_0}\sum_{k=1}^{L_0}\Phi^{k-1}S\left(\Phi^{k-1}\right)^H \tag{5.6.24}$$

denotes the smoothed source covariance matrix.

To understand the effect of spatial smoothing on the correlation between different
sources, consider $\bar{S}_{ij}$. It follows from (5.6.24) that

$$\bar{S}_{ij} = \frac{1}{L_0}\sum_{k=1}^{L_0}\Phi_{ii}^{k-1}S_{ij}\left(\Phi_{jj}^{k-1}\right)^* \tag{5.6.25}$$

Equation (5.6.25) along with (5.6.19) implies that

$$\bar{S}_{ij} = \begin{cases} S_{ij}, & i = j \\ \dfrac{S_{ij}}{L_0}\displaystyle\sum_{k=1}^{L_0} e^{j2\pi\bar{d}\psi_{ij}(k-1)}, & i \neq j \end{cases} \tag{5.6.26}$$

where

$$\psi_{ij} = \cos\theta_i - \cos\theta_j \tag{5.6.27}$$

Using

$$1 + a + a^2 + \cdots + a^{N-1} = \frac{1 - a^N}{1 - a} \tag{5.6.28}$$

the $i \neq j$ terms in (5.6.26) simplify, apart from the unit-magnitude phase factor
$e^{j\pi\bar{d}\psi_{ij}(L_0 - 1)}$, to

$$\bar{S}_{ij} = \frac{S_{ij}}{L_0}\,\frac{\sin(\pi\bar{d}\psi_{ij}L_0)}{\sin(\pi\bar{d}\psi_{ij})} \tag{5.6.29}$$

which reduces in magnitude as $L_0$ increases and goes to zero in the limit.

Thus, the sources progressively get decorrelated as the number of subarrays is increased,
and the rate of decorrelation depends on the element spacing and the source directions.
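
The decorrelation predicted by (5.6.24) and (5.6.29) can be checked numerically. The sketch below is illustrative only: the source directions (90° and 65°), the half-wavelength spacing, and the source covariance with an off-diagonal value of 0.9 are assumed values, and the helper names are hypothetical. It evaluates the sum in (5.6.24) directly and compares the magnitude of the off-diagonal entry with the closed form of (5.6.29).

```python
import numpy as np

def smoothed_source_covariance(S, thetas_deg, L0, spacing=0.5):
    """Evaluate (5.6.24): average of Phi^(k-1) S (Phi^(k-1))^H over k = 1..L0."""
    phi = np.exp(2j * np.pi * spacing * np.cos(np.deg2rad(thetas_deg)))
    S_bar = np.zeros(S.shape, dtype=complex)
    for k in range(L0):                       # k here plays the role of k-1 in the text
        Phi_k = np.diag(phi ** k)
        S_bar += Phi_k @ S @ Phi_k.conj().T
    return S_bar / L0

def closed_form_magnitude(s_ij, psi, L0, spacing=0.5):
    """Magnitude of the off-diagonal term according to (5.6.29)."""
    x = np.pi * spacing * psi
    return abs(s_ij) * abs(np.sin(x * L0) / (L0 * np.sin(x)))

thetas = np.array([90.0, 65.0])                           # assumed directions
S = np.array([[1.0, 0.9], [0.9, 1.0]], dtype=complex)     # assumed correlated pair
psi = np.cos(np.deg2rad(thetas[0])) - np.cos(np.deg2rad(thetas[1]))

magnitudes = []
for L0 in (1, 2, 4, 8):
    S_bar = smoothed_source_covariance(S, thetas, L0)
    magnitudes.append(abs(S_bar[0, 1]))
    # The direct sum and the closed form agree in magnitude.
    assert np.isclose(abs(S_bar[0, 1]), closed_form_magnitude(S[0, 1], psi, L0))
print(magnitudes)
```

The diagonal entries of the smoothed matrix are unchanged by the averaging, so the source powers are preserved while the cross-correlation shrinks.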


5.6.2 Adaptive Algorithm 

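In this section, updating the weights of an array processor from available array samples
using the spatial smoothing method is discussed [Sha85]. Assume that $N$ samples of the
$k$th subarray signal vector $\mathbf{x}_k(n)$, $n = 1, 2, \ldots, N$ are available. It follows then from (5.6.4)
and (5.6.5) that an estimate of the spatially smoothed correlation matrix is given by

$$\bar{R}_N = \frac{1}{L_0}\sum_{k=1}^{L_0}\frac{1}{N}\sum_{n=1}^{N}\mathbf{x}_k(n)\mathbf{x}_k^H(n) = \frac{1}{NL_0}\sum_{n=1}^{N}\sum_{k=1}^{L_0}\mathbf{x}_k(n)\mathbf{x}_k^H(n) \tag{5.6.30}$$

Using the next sample $\mathbf{x}_k(N + 1)$, the matrix $\bar{R}_{N+1}$ becomes

$$\begin{aligned}
\bar{R}_{N+1} &= \frac{1}{(N+1)L_0}\sum_{n=1}^{N+1}\sum_{k=1}^{L_0}\mathbf{x}_k(n)\mathbf{x}_k^H(n) \\
&= \frac{1}{(N+1)L_0}\sum_{n=1}^{N}\sum_{k=1}^{L_0}\mathbf{x}_k(n)\mathbf{x}_k^H(n) + \frac{1}{(N+1)L_0}\sum_{k=1}^{L_0}\mathbf{x}_k(N+1)\mathbf{x}_k^H(N+1) \\
&= \frac{N\bar{R}_N}{N+1} + \frac{1}{(N+1)L_0}\sum_{k=1}^{L_0}\mathbf{x}_k(N+1)\mathbf{x}_k^H(N+1)
\end{aligned} \tag{5.6.31}$$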

Thus, using (5.6.31), the spatially smoothed correlation matrix can be updated as new 
samples arrive and the new matrix can be used to update the weights. For example, this 
matrix can be used to estimate the power surface gradient, and the gradient-based adaptive 
algorithm discussed previously can be employed to update the weights of an array processor. 
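
The recursion of (5.6.31) can be sketched as follows. This is a minimal illustration, assuming random complex data in place of real array samples; the helper names `subarray_snapshots` and `update_smoothed` are hypothetical. It folds snapshots in one at a time and compares the result with the batch estimate of (5.6.30).

```python
import numpy as np

def subarray_snapshots(x_full, L):
    """Split one full-array snapshot into the L0 overlapping subarray vectors."""
    L0 = x_full.shape[0] - L + 1
    return np.stack([x_full[k:k + L] for k in range(L0)])    # shape (L0, L)

def update_smoothed(R_prev, N, x_full, L):
    """One step of (5.6.31): fold snapshot N+1 into the estimate built from N snapshots."""
    X = subarray_snapshots(x_full, L)
    L0 = X.shape[0]
    new_term = X.T @ X.conj() / L0           # (1/L0) * sum_k x_k x_k^H
    return (N * R_prev + new_term) / (N + 1)

# Assumed test data: complex Gaussian samples standing in for array snapshots.
rng = np.random.default_rng(0)
L, L0, N_total = 4, 3, 50
shape = (N_total, L + L0 - 1)
data = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

R_rec = np.zeros((L, L), dtype=complex)
for n, x in enumerate(data):
    R_rec = update_smoothed(R_rec, n, x, L)

# Batch estimate (5.6.30) for comparison.
R_batch = np.zeros((L, L), dtype=complex)
for x in data:
    X = subarray_snapshots(x, L)
    R_batch += X.T @ X.conj() / L0
R_batch /= N_total

max_difference = np.max(np.abs(R_rec - R_batch))
print(max_difference)
```

The recursive form needs only the previous estimate and the newest snapshot, which is what makes it suitable for sample-by-sample weight updates.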

Equation (5.6.31) can also be employed to update the inverse of the correlation matrix 
by making successive use of the Matrix Inverse Lemma and the weights can be estimated 
using the sample inversion algorithm as discussed in Chapter 3.


5.7 Structured Beamforming Method

In this section, the use of a structured correlation matrix for estimating optimal weights 
is discussed, and the decorrelation effect of this technique on the correlated environment 
is examined [God90]. For ease of analysis, only two sources are assumed to be correlated. 
The presence of other uncorrelated sources does not alter the analysis. 

For the linear array of equispaced receivers immersed in a homogeneous noise field, 
the array correlation matrix has a Toeplitz structure, that is, the entries along any diagonal 
are equal. In the presence of correlated sources, the array correlation matrix does not have 
this structure. The technique proposed here uses an estimate of the array correlation matrix 
constrained to have this structure. This constraint is implemented by averaging the
unconstrained array correlation matrix along the diagonals. Let this matrix, referred to as the
structured correlation matrix, be denoted by $\tilde{R}$. The entries along the $m$th diagonal of $\tilde{R}$
are given by

$$\tilde{R}_m = \frac{1}{L - m}\sum_{l=1}^{L-m}R_{l,l+m}, \quad m = 0, 1, \ldots, L - 1 \tag{5.7.1}$$

Using the structured correlation matrix, the following expression is obtained for the
weights of the optimal ESP with unity constraint in the look direction:

$$\tilde{\mathbf{w}} = \frac{\tilde{R}^{-1}\mathbf{S}_0}{\mathbf{S}_0^H\tilde{R}^{-1}\mathbf{S}_0} \tag{5.7.2}$$

The mean output power of the processor for a given $\tilde{\mathbf{w}}$ is given by

$$P(\tilde{\mathbf{w}}) = \tilde{\mathbf{w}}^H R\,\tilde{\mathbf{w}} \tag{5.7.3}$$

It should be noted that the structured correlation matrix is used only
in the feedback loop to calculate the weights of the processor and not in estimating the
output power. The structured correlation matrix can be used in obtaining the weights of
the optimal PIC processor by replacing $R$ by $\tilde{R}$ in (5.3.2).
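
The diagonal averaging of (5.7.1) and the weight expression of (5.7.2) can be sketched as below. This is an illustration under assumed values rather than the author's implementation: a ten-element half-wavelength array, a unit-power signal at 90°, one unit-power interference at 65° with correlation coefficient $0.99e^{j45^\circ}$, and noise power 0.01. The correlation matrix is built from the two-source model used in the decorrelation analysis of Section 5.7.1, and the function names are hypothetical.

```python
import numpy as np

def steering_vector(L, theta_deg, d=0.5):
    """Steering vector with Element 1 as the reference; d is in wavelengths."""
    return np.exp(2j * np.pi * d * np.arange(L) * np.cos(np.deg2rad(theta_deg)))

def structured_matrix(R):
    """Average R along each diagonal (5.7.1) and rebuild a Hermitian Toeplitz matrix."""
    L = R.shape[0]
    r = np.array([np.mean(np.diagonal(R, offset=m)) for m in range(L)])
    idx = np.arange(L)
    diff = idx[None, :] - idx[:, None]            # column index minus row index
    return np.where(diff >= 0, r[np.abs(diff)], r[np.abs(diff)].conj())

def constrained_weights(R, s0):
    """Weights with unity response in the look direction, as in (5.7.2)."""
    v = np.linalg.solve(R, s0)
    return v / (s0.conj() @ v)

# Assumed scenario (values chosen for illustration only).
L = 10
s0 = steering_vector(L, 90.0)
sI = steering_vector(L, 65.0)
delta = 0.99 * np.exp(1j * np.deg2rad(45.0))
p_s, p_i, noise = 1.0, 1.0, 0.01
R = (p_s * np.outer(s0, s0.conj()) + p_i * np.outer(sI, sI.conj()) + noise * np.eye(L)
     + np.sqrt(p_s * p_i) * (delta * np.outer(s0, sI.conj())
                             + np.conj(delta) * np.outer(sI, s0.conj())))

R_tilde = structured_matrix(R)
w_opt = constrained_weights(R, s0)        # weights from the unstructured matrix
w_st = constrained_weights(R_tilde, s0)   # weights from the structured matrix

print("look-direction response:", abs(w_st.conj() @ s0))
print("interference response (optimal, structured):",
      abs(w_opt.conj() @ sI), abs(w_st.conj() @ sI))
```

Solving the linear system with `np.linalg.solve` avoids forming the inverse explicitly; the normalization by $\mathbf{S}_0^H\tilde{R}^{-1}\mathbf{S}_0$ enforces the unity look-direction constraint.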


5.7.1 Decorrelation Analysis 

The decorrelation effect of this method, referred to as the structured method, is examined
[God90] in this section. Rewrite (5.4.2) in the following form:

$$R = \left[p_S\mathbf{S}_0\mathbf{S}_0^H + p_I\mathbf{S}_I\mathbf{S}_I^H + \sigma_n^2 I\right] + \sqrt{p_S p_I}\left[\delta\,\mathbf{S}_0\mathbf{S}_I^H + \delta^*\mathbf{S}_I\mathbf{S}_0^H\right] \tag{5.7.4}$$

The term in the first set of square brackets on the RHS of (5.7.4) is not a function of the
correlation coefficient, has a Toeplitz structure, and is not affected by averaging along the
diagonals. The term in the second set of square brackets depends on $\delta$ and does not have
a Toeplitz structure. Thus, it is sufficient to examine the effect of averaging along the
diagonals on this term. Let this term be denoted by $Q$, that is,

$$Q = \delta\,\mathbf{S}_0\mathbf{S}_I^H + \delta^*\mathbf{S}_I\mathbf{S}_0^H \tag{5.7.5}$$

For an equispaced linear array, the $l$th component of a steering vector associated with $\theta$
is given by

$$(\mathbf{S})_l = e^{j2\pi d(l-1)\cos\theta} \tag{5.7.6}$$

where $d$ is the spacing between the elements measured in wavelengths, and $\theta$ is the
direction of a source relative to the line of the array.

In writing (5.7.6), Element 1 is taken as the time reference. Consider the $(l,k)$th element
of $\mathbf{S}_0\mathbf{S}_I^H$:

$$(\mathbf{S}_0\mathbf{S}_I^H)_{l,k} = \exp\{j2\pi d[(l-1)\cos\theta_0 - (k-1)\cos\theta_I]\} \tag{5.7.7}$$

with $\theta_0$ and $\theta_I$, respectively, denoting the direction of the signal and the interference relative
to the line of the array.

Let

$$\phi = \cos\theta_0 - \cos\theta_I \tag{5.7.8}$$

Then the $m$th diagonal of $\mathbf{S}_0\mathbf{S}_I^H$ is given by

$$(\mathbf{S}_0\mathbf{S}_I^H)_{l,l+m} = \exp\{-j2\pi dm\cos\theta_I\}\exp\{j2\pi d(l-1)\phi\}, \quad l = 1, 2, \ldots, L - m \tag{5.7.9}$$

Let $q_m$ denote the average of the $m$th diagonal. It follows then from (5.7.9) that

$$q_m = \exp\{-j2\pi dm\cos\theta_I\}\frac{1}{L-m}\sum_{l=1}^{L-m}\exp\{j2\pi d(l-1)\phi\} \tag{5.7.10}$$

Using the identity

$$1 + a + a^2 + \cdots + a^{N-1} = \frac{1 - a^N}{1 - a} \tag{5.7.11}$$

from (5.7.10),

$$q_m = \exp\{-j2\pi dm\cos\theta_I\}\frac{1}{L-m}\,\frac{1 - \exp\{j2\pi d(L-m)\phi\}}{1 - \exp\{j2\pi d\phi\}} \tag{5.7.12}$$

is obtained, which, after manipulation, leads to

$$q_m = \exp\{j\pi d(L-1)\phi\}\exp\{-j\pi dm(\cos\theta_I + \cos\theta_0)\}\frac{\sin\pi(L-m)d\phi}{(L-m)\sin\pi d\phi} \tag{5.7.13}$$

Now consider $\mathbf{S}_I\mathbf{S}_0^H$. Let $q'_m$ denote the average of the $m$th diagonal of this matrix, that is,

$$q'_m = \exp\{-j2\pi dm\cos\theta_0\}\frac{1}{L-m}\sum_{l=1}^{L-m}\exp\{-j2\pi d(l-1)\phi\} \tag{5.7.14}$$

which, after manipulation, leads to

$$q'_m = \exp\{-j\pi d(L-1)\phi\}\exp\{-j\pi dm(\cos\theta_I + \cos\theta_0)\}\frac{\sin\pi(L-m)d\phi}{(L-m)\sin\pi d\phi} \tag{5.7.15}$$

It follows from (5.7.5), (5.7.10), and (5.7.14) that the entries along the $m$th diagonal of
the structured matrix $\tilde{Q}$, obtained from the term in the second set of square brackets in (5.7.4),
are given by

$$\tilde{Q}_m = \delta q_m + \delta^* q'_m \tag{5.7.16}$$

which, along with (5.7.13) and (5.7.15), implies that

$$\tilde{Q}_m = \exp\{-j\pi dm(\cos\theta_I + \cos\theta_0)\}\frac{\sin\pi(L-m)d\phi}{(L-m)\sin\pi d\phi}\left(\delta\exp\{j\pi d(L-1)\phi\} + \delta^*\exp\{-j\pi d(L-1)\phi\}\right) \tag{5.7.17}$$

If $|\delta|$ and $\delta_p$, respectively, denote the magnitude and phase of the correlation coefficient
measured at the reference point (Element 1 in the present case), (5.7.17) reduces to

$$\tilde{Q}_m = \exp\{-j\pi dm(\cos\theta_I + \cos\theta_0)\}\frac{2|\delta|\sin\pi(L-m)d\phi}{(L-m)\sin\pi d\phi}\cos\psi_p \tag{5.7.18}$$

where

$$\psi_p = \delta_p + \pi d(L-1)\phi \tag{5.7.19}$$

is the phase of the correlation coefficient measured at the center of the array.

Equation (5.7.18) describes the $m$th diagonal of the component of the structured correlation
matrix that depends on the correlation coefficient. From (5.7.18), the following
observations can be made. For

$$\psi_p = (2n + 1)\frac{\pi}{2}, \quad n = 0, \pm 1, \ldots, \tag{5.7.20}$$

$\cos\psi_p = 0$ and thus $\tilde{Q}_m$, $m = 0, 1, \ldots, L - 1$ reduce to zero. Thus, for these values of the
correlation phase, the two sources are completely decorrelated. This result is independent
of the magnitude of the correlation coefficient.

For a given element spacing and source direction, the magnitudes of $\tilde{Q}_m$, $m = 0, 1, \ldots,
L - 1$ behave like the well-known function $(\sin x)/x$, with the zeros given by

$$(L - m)d\phi = n, \quad n = \pm 1, \pm 2, \ldots \tag{5.7.21}$$

As the number of elements in the array increases, the magnitude of $\tilde{Q}_m$ decreases. The
greatest reduction occurs in the elements of the principal diagonal corresponding to $m = 0$.
The magnitude of $\tilde{Q}_0$ is given by

$$|\tilde{Q}_0| = \left|\frac{2|\delta|\sin\pi Ld\phi}{L\sin\pi d\phi}\cos\psi_p\right| \tag{5.7.22}$$

As $m$ increases, the effect of increased elements in the array declines. The last diagonal
of $Q$ that consists of only one element, $Q_{1L}$, is not affected.
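
The closed form (5.7.18) can be checked against a direct average of the diagonals of $Q$. The sketch below is illustrative: the array size, directions, and correlation coefficient are assumed values, and the variable names are not from the original text.

```python
import numpy as np

# Assumed scenario: ten elements, half-wavelength spacing, signal at 90 deg,
# interference at 65 deg, correlation coefficient 0.99 at 45 deg phase.
L, d = 10, 0.5
theta0, thetaI = np.deg2rad(90.0), np.deg2rad(65.0)
delta = 0.99 * np.exp(1j * np.deg2rad(45.0))

l = np.arange(L)
s0 = np.exp(2j * np.pi * d * l * np.cos(theta0))
sI = np.exp(2j * np.pi * d * l * np.cos(thetaI))
Q = delta * np.outer(s0, sI.conj()) + np.conj(delta) * np.outer(sI, s0.conj())   # (5.7.5)

phi = np.cos(theta0) - np.cos(thetaI)                      # (5.7.8)
psi_p = np.angle(delta) + np.pi * d * (L - 1) * phi        # (5.7.19)

max_error = 0.0
for m in range(L):
    averaged = np.mean(np.diagonal(Q, offset=m))           # direct diagonal average
    closed = (np.exp(-1j * np.pi * d * m * (np.cos(thetaI) + np.cos(theta0)))
              * 2 * abs(delta) * np.sin(np.pi * (L - m) * d * phi)
              / ((L - m) * np.sin(np.pi * d * phi)) * np.cos(psi_p))   # (5.7.18)
    max_error = max(max_error, abs(averaged - closed))
print(max_error)
```

Setting the phase of `delta` so that `psi_p` is an odd multiple of $\pi/2$ makes every averaged diagonal vanish, which is the complete decorrelation case of (5.7.20).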


5.7.1.1 Examples and Discussion

For these examples, a linear array with one-half wavelength spacing is used. A unity
power signal source is assumed to be present broadside to the array. Unless otherwise
specified, the correlation phase is measured at an end element of the array.

Figure 5.6 compares the power patterns of the conventional beamformer, optimal beamformer,
and structured beamformer. Eight interferences are assumed in the directions of
the side-lobes of the conventional pattern. The interference in the 45° direction is correlated
with the signal source. It is clear from the figure that even in the presence of correlated
arrivals, a ten-element array using the structured method is capable of nulling eight
directional sources while maintaining a specified response in the look direction. As expected,
the increased response of the optimal processor in the direction of the correlated interference
is clearly visible.

FIGURE 5.6

Power pattern of an element space processor using conventional, optimal, and structured beamforming methods
using a ten-element linear array with one-half wavelength spacing in the presence of eight directional interferences
in directions 25°, 45°, 60°, 108°, 120°, 135°, and 155°, each with unity power. Look direction is 90°, $\sigma_n^2 = 4$,
sources in 45° and 90° are correlated with $|\delta| = 1$, $\delta_p = 45^\circ$. (From Godara, L.C., IEEE Trans. Acoust. Speech Signal
Process., 38, 1-15, 1990. ©IEEE. With permission.)

FIGURE 5.7

Output SNR of the element space processor and the PIC processor using structured beamforming method vs.
the magnitude of the correlation coefficient for a four-element linear array with one-half wavelength spacing in
the presence of one directional interference of unity power in direction 85°. Correlation phase is 90° measured
at the center of the array, $\sigma_n^2 = 0.001$. (From Godara, L.C., IEEE Trans. Acoust. Speech Signal Process., 38, 1-15, 1990.
©IEEE. With permission.)

Figure 5.7 shows the output SNRs of the PIC and the ESP using the structured method 
and compares the result to that of the optimal beamformer. The phase of the correlation 
coefficient measured at the center of the array is assumed to be 90°. The magnitude of the 
correlation has almost no effect on the output SNRs of the two processors when the 
structured method is used. This agrees with the analysis presented in the previous section. 
The output SNR of the optimal beamformer reduces to about -25 dB when the two sources 
are fully correlated. 


5.7.2 Structured Gradient Algorithm 

The structured gradient algorithm uses the structured array correlation matrix to estimate 
the required gradient to update the weights, and is discussed in detail in Section 3.6. In
this section, the analysis focuses on its use in updating the weights in
the presence of correlated arrivals. The analysis is presented for an equispaced linear array
in the presence of two correlated sources [God91].

Let an $L$-dimensional vector $\mathbf{w}(n + 1)$ denote the weights of the ESP updated by the
structured gradient algorithm, that is,

$$\mathbf{w}(n + 1) = P\{\mathbf{w}(n) - \mu\,\mathbf{g}_{st}(\mathbf{w}(n))\} + \frac{\mathbf{S}_0}{L} \tag{5.7.23}$$

where $P$ is the projection operator given by (3.4.2), and $\mathbf{g}_{st}(\mathbf{w}(n))$ denotes the gradient
estimate defined by (3.6.5) to (3.6.7). It follows from (3.6.5) to (3.6.7) that

$$E[\mathbf{g}_{st}(\mathbf{w}(n)) \mid \mathbf{w}(n)] = 2\tilde{R}\,\mathbf{w}(n) \tag{5.7.24}$$

where $\tilde{R}$ is given by (5.7.1).
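
A single iteration of the form (5.7.23) can be sketched as follows. Several things here are assumptions, since (3.4.2) and (3.6.5) to (3.6.7) lie outside this section: the projection operator is taken as $P = I - \mathbf{S}_0\mathbf{S}_0^H/L$ (the usual form for a unity look-direction constraint), the constant term is taken as $\mathbf{S}_0/L$, and the expected gradient $2\tilde{R}\mathbf{w}(n)$ of (5.7.24) is used in place of the per-snapshot estimate. A random Hermitian matrix stands in for $\tilde{R}$, and the function names are hypothetical. The point of the sketch is that the update keeps the look-direction response at unity.

```python
import numpy as np

def projection_operator(s0):
    """Assumed form P = I - s0 s0^H / L, which removes the look-direction component."""
    L = s0.shape[0]
    return np.eye(L) - np.outer(s0, s0.conj()) / L

def constrained_update(w, gradient, s0, mu):
    """One iteration of the form (5.7.23) with the assumed P and constant term."""
    L = s0.shape[0]
    return projection_operator(s0) @ (w - mu * gradient) + s0 / L

rng = np.random.default_rng(1)
L = 8
# Broadside look direction, half-wavelength spacing.
s0 = np.exp(2j * np.pi * 0.5 * np.arange(L) * np.cos(np.deg2rad(90.0)))
B = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
R_stand_in = B @ B.conj().T / L          # arbitrary Hermitian matrix standing in for R-tilde

w = s0 / L                               # conventional weights as the starting point
for _ in range(20):
    w = constrained_update(w, 2 * R_stand_in @ w, s0, mu=0.01)

constraint_value = s0.conj() @ w         # look-direction response S_0^H w
print(abs(constraint_value))
```

Because $\mathbf{S}_0^H P = 0$ and $\mathbf{S}_0^H\mathbf{S}_0 = L$, every iterate satisfies $\mathbf{S}_0^H\mathbf{w}(n) = 1$ regardless of the gradient used.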

5.7.2.1 Gradient Comparison 

Let $\mathbf{g}(\mathbf{w}(n))$ denote the gradient of the mean output power for a given $\mathbf{w}(n)$ when the
sources are not correlated, that is,

$$\mathbf{g}(\mathbf{w}(n)) = 2R_0\mathbf{w}(n) \tag{5.7.25}$$

where $R_0$ denotes the array correlation matrix when sources are not correlated, that is, $\delta = 0$.
It follows from (5.7.4) and (5.7.5) that

$$R = R_0 + \sqrt{p_S p_I}\,Q \tag{5.7.26}$$

Let an $L$-dimensional error vector $\mathbf{e}(n)$ denote the difference between the true gradient
used in the standard LMS algorithm in the absence of the correlated field given by (5.7.25),
and the expected value of the gradient used by the structured gradient algorithm in the
presence of a correlated field, that is,

$$\mathbf{e}(n) = \mathbf{g}(\mathbf{w}(n)) - E[\mathbf{g}_{st}(\mathbf{w}(n)) \mid \mathbf{w}(n)] \tag{5.7.27}$$

The normalized norm of the error vector approaches zero in the limit as $L \to \infty$, that is,

$$\lim_{L\to\infty}\frac{\mathbf{e}^H(n)\mathbf{e}(n)}{L} = 0 \tag{5.7.28}$$

Equation (5.7.28) is now established. It follows from (5.7.24), (5.7.25), and (5.7.27) that

$$\mathbf{e}(n) = 2R_0\mathbf{w}(n) - 2\tilde{R}\mathbf{w}(n) \tag{5.7.29}$$

Since $\tilde{R}$ is obtained from $R$ by averaging along the diagonals, it follows from (5.7.4) and
(5.7.5) that

$$\tilde{R} = R_0 + \sqrt{p_S p_I}\,\tilde{Q} \tag{5.7.30}$$

where $\tilde{Q}$ is a matrix having a Toeplitz structure with the entries along the $m$th diagonal
given by

$$\tilde{Q}_m = \frac{1}{L - m}\sum_{l=1}^{L-m}Q_{l,l+m}, \quad m = 0, 1, \ldots, L - 1 \tag{5.7.31}$$

Note that the matrix $R_0$ denotes the array correlation matrix of an equispaced linear
array immersed in an uncorrelated noise field and thus has a Toeplitz structure. Hence,
it is not affected by averaging along the diagonals.

It follows from (5.7.29) and (5.7.30) that

$$\mathbf{e}(n) = -2\sqrt{p_S p_I}\,\tilde{Q}\mathbf{w}(n) \tag{5.7.32}$$

Taking the dot product, dividing by $L$, and taking the limit on both sides,

$$\lim_{L\to\infty}\frac{\mathbf{e}^H(n)\mathbf{e}(n)}{L} = 4p_S p_I\,\mathbf{w}^H(n)C\mathbf{w}(n) \tag{5.7.33}$$

where

$$C = \lim_{L\to\infty}\frac{\tilde{Q}^H\tilde{Q}}{L} \tag{5.7.34}$$

Consider the matrix $C$. Its $(l,n)$th element is given by

$$C_{l,n} = \lim_{L\to\infty}\frac{1}{L}\sum_{k=1}^{L}\tilde{Q}_{k,l}^*\tilde{Q}_{k,n} \tag{5.7.35}$$

It follows from (5.7.18) that the $(l,k)$th element of $\tilde{Q}$ is given by

$$\tilde{Q}_{l,k} = \alpha_0\exp\{-j\beta_0(k - l)\}\frac{\sin\{\pi d\phi(L - |k - l|)\}}{L - |k - l|} \tag{5.7.36}$$

where

$$\alpha_0 = \frac{2|\delta|\cos\psi_p}{\sin\pi d\phi} \tag{5.7.37}$$

and

$$\beta_0 = \pi d(\cos\theta_I + \cos\theta_0) \tag{5.7.38}$$

From (5.7.35) and (5.7.36), it follows that

$$C_{l,n} = \alpha_0^2\exp\{-j\beta_0(n - l)\}\lim_{L\to\infty}\frac{1}{L}\sum_{k=1}^{L}\frac{\sin\{\pi d\phi(L - |k - l|)\}}{L - |k - l|}\,\frac{\sin\{\pi d\phi(L - |n - k|)\}}{L - |n - k|} \tag{5.7.39}$$

Since $-1 \le \sin x \le 1\ \forall x$ and $1/(L - |n - k|) \le 1$ for $1 \le n \le L$, $1 \le k \le L$, it follows that

$$-1 \le \sin\{\pi d\phi(L - |k - l|)\}\,\frac{\sin\{\pi d\phi(L - |n - k|)\}}{L - |n - k|} \le 1 \tag{5.7.40}$$

Thus,

$$-\alpha_0^2\alpha_l \le \exp\{j\beta_0(n - l)\}C_{l,n} \le \alpha_0^2\alpha_l \tag{5.7.41}$$

where

$$\alpha_l = \lim_{L\to\infty}\frac{1}{L}\sum_{k=1}^{L}\frac{1}{L - |k - l|} \tag{5.7.42}$$

Since

$$\begin{aligned}
\sum_{k=1}^{L}\frac{1}{L - |k - l|} &= \sum_{k=1}^{l}\frac{1}{L - l + k} + \sum_{k=l+1}^{L}\frac{1}{L + l - k} \\
&= \left(\frac{1}{L - l + 1} + \cdots + \frac{1}{L}\right) + \left(\frac{1}{L - 1} + \frac{1}{L - 2} + \cdots + \frac{1}{l}\right) \\
&\le \left(1 + \frac{1}{2} + \cdots + \frac{1}{L}\right) + \left(\frac{1}{L - 1} + \cdots + 1\right) \\
&\le 2\sum_{k=1}^{L}\frac{1}{k}
\end{aligned} \tag{5.7.43}$$

from (5.7.42) and (5.7.43), it follows that

$$\alpha_l \le \lim_{L\to\infty}\frac{2}{L}\sum_{k=1}^{L}\frac{1}{k} = 0 \tag{5.7.44}$$

Along with (5.7.39), (5.7.44) implies that

$$C_{l,n} = 0 \quad \forall\, l \text{ and } \forall\, n \tag{5.7.45}$$

Thus, $C = 0$ and it follows from (5.7.33) that

$$\lim_{L\to\infty}\frac{\mathbf{e}^H(n)\mathbf{e}(n)}{L} = 0 \tag{5.7.46}$$

This implies (5.7.28).

Thus, the normalized error vector norm approaches zero in the limit as $L \to \infty$. The error
vector is the difference between the true gradient used in the standard LMS algorithm in
the presence of the uncorrelated noise field and the expected value of the gradient used
by the structured gradient algorithm in the presence of correlated arrivals. Since the use
of the true gradient in the standard LMS algorithm leads the estimated weight vector to
the optimal weight vector $\hat{\mathbf{w}}$, and the processor using $\hat{\mathbf{w}}$ minimizes the total noise when
the noise field is not correlated, it follows that by using the gradient estimated by the
structured method, the mean value of the estimated weight would approach $\hat{\mathbf{w}}$ for an
infinitely large array. Thus, the processor in the presence of correlated arrivals would have
the same antenna pattern (in the mean sense) as it has in the presence of the uncorrelated
noise field. Thereby, the correlated jammer would be canceled.
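
The limit $C = 0$ can be illustrated numerically for finite arrays. The sketch below is an assumption-laden illustration rather than part of the original analysis: it uses a signal at 90°, an interference at 65°, half-wavelength spacing, $|\delta| = 0.99$, and holds the correlation phase at the array center fixed at $\psi_p = 0$ so that $\alpha_0$ does not change with $L$. The helper names are hypothetical. For each array size it builds the Toeplitz-averaged matrix $\tilde{Q}$ and reports the largest magnitude among the entries of $\tilde{Q}^H\tilde{Q}/L$.

```python
import numpy as np

def structured_Q(L, theta0_deg=90.0, thetaI_deg=65.0, d=0.5, delta_mag=0.99, psi_p=0.0):
    """Toeplitz-averaged cross term, with the correlation phase fixed at the array center."""
    c0, cI = np.cos(np.deg2rad(theta0_deg)), np.cos(np.deg2rad(thetaI_deg))
    l = np.arange(L)
    s0 = np.exp(2j * np.pi * d * l * c0)
    sI = np.exp(2j * np.pi * d * l * cI)
    delta_p = psi_p - np.pi * d * (L - 1) * (c0 - cI)       # invert (5.7.19)
    delta = delta_mag * np.exp(1j * delta_p)
    Q = delta * np.outer(s0, sI.conj()) + np.conj(delta) * np.outer(sI, s0.conj())
    q = np.array([np.mean(np.diagonal(Q, offset=m)) for m in range(L)])   # (5.7.31)
    diff = l[None, :] - l[:, None]
    return np.where(diff >= 0, q[np.abs(diff)], q[np.abs(diff)].conj())

def largest_entry(L):
    """Largest magnitude among the entries of Q-tilde^H Q-tilde / L for an L-element array."""
    Qs = structured_Q(L)
    return np.max(np.abs(Qs.conj().T @ Qs)) / L

sizes = [16, 64, 256]
values = [largest_entry(L) for L in sizes]
for L, v in zip(sizes, values):
    print(L, v)
```

The printed values shrink as the array grows, in line with the bound of (5.7.41) to (5.7.44), which decays like $(\ln L)/L$.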

5.7.2.2 Weight Vector Comparison 

In this section, the normalized difference between the expected values of the weights
estimated by the standard method when the noise field is not correlated, and by the
structured method when the noise field is correlated, is examined. Let $\mathbf{w}(n)$
denote the weights estimated by the standard LMS algorithm in the absence of correlation,
that is, when $\delta = 0$, and let $\tilde{\mathbf{w}}(n)$ denote the weights estimated by the structured method
using (5.7.23). It is assumed for the purpose of the comparison that at the $n$th iteration, both
methods have the same weight vector, that is, $\mathbf{w}(n) = \tilde{\mathbf{w}}(n)$.

Let

$$\mathbf{e}(n + 1) = E[\mathbf{w}(n + 1)] - E[\tilde{\mathbf{w}}(n + 1)] \tag{5.7.47}$$

Now it is shown that

$$\lim_{L\to\infty}\frac{\mathbf{e}^H(n + 1)\mathbf{e}(n + 1)}{L} = 0 \tag{5.7.48}$$

It follows from (5.7.24) and (5.7.30) that

$$E[\mathbf{g}_{st}(\tilde{\mathbf{w}}(n)) \mid \tilde{\mathbf{w}}(n)] = 2R_0\tilde{\mathbf{w}}(n) + 2\sqrt{p_S p_I}\,\tilde{Q}\tilde{\mathbf{w}}(n) \tag{5.7.49}$$

Denoting the gradient estimate of (3.4.4) by $\mathbf{g}_s(\mathbf{w}(n))$, from (3.4.4) one obtains

$$E[\mathbf{g}_s(\mathbf{w}(n)) \mid \mathbf{w}(n)] = 2R_0\mathbf{w}(n) \tag{5.7.50}$$

Taking the expected value on both sides of (3.4.1) and (5.7.23), using (5.7.47), (5.7.49),
(5.7.50), and the fact that both the weight vectors are identical at the $n$th iteration
$\{\mathbf{w}(n) = \tilde{\mathbf{w}}(n)\}$,

$$\mathbf{e}(n + 1) = 2\mu\sqrt{p_S p_I}\,P\tilde{Q}\bar{\mathbf{w}}(n) \tag{5.7.51}$$

where

$$\bar{\mathbf{w}}(n) = E[\mathbf{w}(n)] \tag{5.7.52}$$
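Thus,

$$\lim_{L\to\infty}\frac{\mathbf{e}^H(n + 1)\mathbf{e}(n + 1)}{L} = 4\mu^2 p_S p_I\lim_{L\to\infty}\bar{\mathbf{w}}^H(n)\frac{\tilde{Q}^H P\tilde{Q}}{L}\bar{\mathbf{w}}(n) \tag{5.7.53}$$

Now consider $\lim_{L\to\infty}\tilde{Q}^H P\tilde{Q}/L$. It follows from (3.4.2) that

$$\lim_{L\to\infty}\frac{\tilde{Q}^H P\tilde{Q}}{L} = \lim_{L\to\infty}\frac{\tilde{Q}^H\tilde{Q}}{L} - \lim_{L\to\infty}\frac{\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q}}{L^2} \tag{5.7.54}$$

It follows from (5.7.34) and (5.7.45) that

$$\lim_{L\to\infty}\frac{\tilde{Q}^H\tilde{Q}}{L} = 0 \tag{5.7.55}$$

Since $\tilde{Q}^H = \tilde{Q}$ and $(\mathbf{S}_0)_m = \exp(j2\pi d(m - 1)\cos\theta_0)$, it follows from (5.7.36) that

$$\begin{aligned}
(\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q})_{l,k} &= \sum_{n=1}^{L}\sum_{m=1}^{L}\tilde{Q}_{l,m}(\mathbf{S}_0)_m(\mathbf{S}_0)_n^*\tilde{Q}_{n,k} \\
&= \sum_{n=1}^{L}\sum_{m=1}^{L}\alpha_0^2\exp\{-j\beta_0(m - l + k - n)\}\exp\{j2\pi d(m - n)\cos\theta_0\} \\
&\qquad\times\frac{\sin\{\pi d\phi(L - |m - l|)\}}{L - |m - l|}\,\frac{\sin\{\pi d\phi(L - |k - n|)\}}{L - |k - n|}
\end{aligned} \tag{5.7.56}$$

Substituting for $\beta_0$ from (5.7.38) and rearranging,

$$\begin{aligned}
(\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q})_{l,k} &= \alpha_0^2\exp\{-j\beta_0(k - l)\}\sum_{m=1}^{L}\exp\{-j\pi d(\cos\theta_I - \cos\theta_0)m\}\frac{\sin\{\pi d\phi(L - |m - l|)\}}{L - |m - l|} \\
&\qquad\times\sum_{n=1}^{L}\exp\{j\pi d(\cos\theta_I - \cos\theta_0)n\}\frac{\sin\{\pi d\phi(L - |n - k|)\}}{L - |n - k|}
\end{aligned} \tag{5.7.57}$$

Since $-1 \le \sin x \le 1\ \forall x$, it follows that

$$\exp\{j\beta_0(k - l)\}(\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q})_{l,k} \le \alpha_0^2\sum_{m=1}^{L}\exp\{-j\pi d(\cos\theta_I - \cos\theta_0)m\}\sum_{n=1}^{L}\exp\{j\pi d(\cos\theta_I - \cos\theta_0)n\} \tag{5.7.58}$$

and

$$\exp\{j\beta_0(k - l)\}(\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q})_{l,k} \ge -\alpha_0^2\sum_{m=1}^{L}\exp\{-j\pi d(\cos\theta_I - \cos\theta_0)m\}\sum_{n=1}^{L}\exp\{j\pi d(\cos\theta_I - \cos\theta_0)n\} \tag{5.7.59}$$

Using

$$a + a^2 + a^3 + \cdots + a^L = \frac{a(1 - a^L)}{1 - a} \tag{5.7.60}$$

to sum the series on the RHS of (5.7.58) and (5.7.59), dividing by $L^2$ on both sides, and
taking the limit as $L \to \infty$,

$$0 \le \lim_{L\to\infty}\frac{\exp\{j\beta_0(k - l)\}(\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q})_{l,k}}{L^2} \le 0 \tag{5.7.61}$$

This implies that

$$\lim_{L\to\infty}\frac{\tilde{Q}^H\mathbf{S}_0\mathbf{S}_0^H\tilde{Q}}{L^2} = 0 \tag{5.7.62}$$

From (5.7.54), (5.7.55), and (5.7.62) it follows that

$$\lim_{L\to\infty}\frac{\tilde{Q}^H P\tilde{Q}}{L} = 0 \tag{5.7.63}$$

This along with (5.7.53) establishes (5.7.48).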

The implication of this result is that when the array has an infinitely large number of 
elements, the structured LMS algorithm in the presence of correlated arrivals yields the 
same weight vector in the mean sense as estimated by the standard LMS algorithm when 
the sources are not correlated. 

The structured gradient algorithm analyzed above only uses a snapshot available at the
(n + 1)st iteration of the weight update to estimate the gradient. The improved LMS
algorithm discussed in Section 3.8 makes use of all available samples to estimate a gradient
required to update the array weights. Results similar to those given by (5.7.28) and (5.7.48)
for the structured gradient algorithm may also be established for the improved LMS
algorithm following a procedure similar to that discussed above.

Now some numerical examples are presented to compare the performance of the structured
gradient algorithm and the improved algorithm with that of the standard LMS
algorithm in the presence of a correlated field.


5.7.2.3 Examples and Discussion 

FIGURE 5.8

Output power vs. the iteration number for a ten-element linear array with one-half wavelength spacing in the
presence of two directional interferences with direction = 65°, power = 1, $|\delta| = 0.99$, $\delta_p = 45^\circ$ and direction = 72°,
power = 100. Look direction is 90° with signal power = 1. $\sigma_n^2 = 0.01$. (From Godara, L.C., J. Acoust. Soc. Am., 89,
1730-1736, 1991. With permission.)

FIGURE 5.9

Output SNR vs. the iteration number for a ten-element linear array with one-half wavelength spacing in the
presence of two directional interferences with direction = 65°, power = 1, $|\delta| = 0.99$, $\delta_p = 45^\circ$ and direction = 72°,
power = 100. Look direction is 90° with signal power = 1. $\sigma_n^2 = 0.01$. (From Godara, L.C., J. Acoust. Soc. Am., 89,
1730-1736, 1991. With permission.)

Figure 5.8 and Figure 5.9 compare the mean output power $P(\mathbf{w}(n))$ and the output SNR
vs. the iteration number when the weights are adjusted using the three algorithms. A
linear array of ten elements with one-half wavelength spacing is assumed for these examples.
The variance of uncorrelated noise present on each element is assumed to be 0.01.
Two interference sources are assumed to be present. The first interference makes an angle
of 65° with the line of the array, and is correlated with the signal source present broadside
to the array. The magnitude of correlation is taken to be equal to 0.99, and the correlation
phase measured at the reference point (one of the side elements of the array) is equal to
45°. The second interference makes an angle of 72° with the line of the array, and is not
correlated with the signal source. The power of the signal source, as well as of the
correlated interference, is 20 dB above the white noise power. The power of the second
interference is 40 dB above the white noise power. All algorithms are initialized with the
conventional weights, that is,

$$\mathbf{w}(0) = \frac{\mathbf{S}_0}{L} \tag{5.7.64}$$

and the gradient step size $\mu$ in each case is taken to be equal to 0.00005.
The mean output power for a given $\mathbf{w}(n)$ is calculated using

$$P(\mathbf{w}(n)) = \mathbf{w}^H(n)R\mathbf{w}(n) \tag{5.7.65}$$

and the output SNR is calculated using

$$\mathrm{SNR} = \frac{P_S(\mathbf{w}(n))}{P_N(\mathbf{w}(n))} \tag{5.7.66}$$

with $P_S(w(n))$ and $P_N(w(n))$, respectively, denoting the mean output signal power and the total mean output noise power for a given w(n). $P_S(w(n))$ is calculated using

$$P_S(w(n)) = w^H(n)\,R_S\,w(n) \tag{5.7.67}$$

where $R_S$ is the correlation matrix due to the signal only, that is,

$$R_S = E\left[x_S(t)\,x_S^H(t)\right] \tag{5.7.68}$$

and

$$x_S(t) = m_s(t)\left(\sqrt{p_s}\,S_0 + \delta^{*}\sqrt{p_I}\,S_I\right) \tag{5.7.69}$$

Note that $x_S(t)$ is the array signal vector contributed by the signal source.

The mean output noise power is calculated using

$$P_N(w(n)) = w^H(n)\,R_N\,w(n)$$

where $R_N$ is the noise-only array correlation matrix. It is given by

$$R_N = E\left[x_N(t)\,x_N^H(t)\right] \tag{5.7.70}$$

with

$$x_N(t) = m_I(t)\sqrt{p_I\left(1-|\delta|^2\right)}\,S_I + \sqrt{p_2}\,m_2(t)\,S_2 + n(t) \tag{5.7.71}$$

where $p_2$, $m_2(t)$, and $S_2$ characterize the second interference.

Figure 5.8 shows that the mean output power of the processor is close to the input signal 
power when the structured LMS algorithm and the improved LMS algorithm are used to 
update the weights. Thus, in the presence of the correlated arrivals the processor is able 
to cancel the correlated directional interference without canceling the desired signal. This 
agrees with the theoretical results presented previously. The processor using the standard 
LMS algorithm to update the weights cancels the desired signal and the mean output 
power falls below the level of the input signal power, as expected. 

Figure 5.9 compares the output SNR for the three algorithms. The output SNR achievable by the processor using the structured method and the improved method is much higher than that achieved using the standard LMS algorithm. The output SNR curve of the structured method fluctuates much more than that of the improved method, because the structured method uses only one sample to estimate the gradient, whereas the improved method uses all available samples. Thus, by using the improved method to update the weights of an adaptive beamformer, the correlated interference can be canceled and close proximity to the convergence point is attained quickly.
It should be noted here that though the theoretical results are presented to show the 
performance of these algorithms for an infinitely large array, the example presented for a 
ten-element array demonstrates the correlated jammer cancelation capability of these 
algorithms for an array that is not so large. 
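The power and SNR measures used in these examples can be evaluated numerically for any fixed weight vector. The following is a minimal sketch, assuming a steering vector of the form exp(j2π(d/λ)(l − 1)cos θ) with the reference at a side element, and assuming the signal-related and noise-related vectors take the form of (5.7.69) and (5.7.71); the function names and the symbol `p_2` for the power of the second interference are illustrative choices, not taken from the text. It evaluates the conventional weights only and does not simulate the LMS iterations.

```python
import numpy as np

def steering_vector(theta_deg, n_elements, spacing_wl=0.5):
    """Steering vector of a uniform linear array (angle measured from the array line)."""
    l = np.arange(n_elements)
    return np.exp(2j * np.pi * spacing_wl * l * np.cos(np.deg2rad(theta_deg)))

def mean_power(w, R):
    """Mean output power w^H R w for a fixed weight vector w."""
    return float(np.real(w.conj() @ R @ w))

L = 10
p_s, p_i, p_2, sigma2 = 1.0, 1.0, 100.0, 0.01      # powers quoted in the captions
delta = 0.99 * np.exp(1j * np.deg2rad(45.0))        # assumed complex correlation coefficient

s0 = steering_vector(90.0, L)   # look direction (broadside)
si = steering_vector(65.0, L)   # correlated interference
s2 = steering_vector(72.0, L)   # uncorrelated interference

# Signal-only matrix: everything driven by m_s(t), assumed as in (5.7.69).
a_s = np.sqrt(p_s) * s0 + np.conj(delta) * np.sqrt(p_i) * si
R_s = np.outer(a_s, a_s.conj())

# Noise-only matrix: uncorrelated part of the first interference,
# the second interference, and white noise, assumed as in (5.7.71).
R_n = (p_i * (1.0 - abs(delta) ** 2) * np.outer(si, si.conj())
       + p_2 * np.outer(s2, s2.conj())
       + sigma2 * np.eye(L))
R = R_s + R_n

w0 = s0 / L                      # conventional weights
P_total = mean_power(w0, R)
P_sig = mean_power(w0, R_s)
P_noise = mean_power(w0, R_n)
snr = P_sig / P_noise
print(f"P = {P_total:.4f}, P_S = {P_sig:.4f}, P_N = {P_noise:.4f}, SNR = {snr:.4f}")
```

The same two helper functions can be applied to the weights produced at each iteration of an adaptive algorithm to trace curves of the kind shown in Figures 5.8 and 5.9.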


5.8 Correlated Broadband Sources 

In this section, an array processor using the tapped delay line (TDL) structure of Figure 4.1 is considered in the presence of correlated broadband directional sources. The structured beamforming method is proposed to cancel the correlated interferences using a linear array of equispaced elements, and its performance is analyzed [God92].

The array correlation matrix for an equispaced linear array using a TDL filter has a 
special structure in the presence of uncorrelated directional sources, and the correlated 
field destroys this structure. This structure is examined in the next section. 


5.8.1 Structure of Array Correlation Matrix 

Consider a linear array of L equispaced elements immersed in a homogeneous and uncorrelated noise field. For ease of analysis, assume that the array is aligned with the positive x-axis and that one of the elements is situated at the origin. Let the origin of the coordinate system be taken as the time reference. Thus, the time taken by a plane wave arriving from direction $\theta$ and measured from element $l$ to the origin is given by

$$\tau_l(\theta) = \frac{d}{c}(l-1)\cos\theta \tag{5.8.1}$$

where d is the spacing between the elements and c is the speed of propagation. It is assumed that the spacing is less than a half-wavelength at all frequencies of interest. Let an L-dimensional vector x(t) denote the sensor outputs after presteering delays $T_l(\theta_0)$, $l = 1, 2, \ldots, L$. These delays are selected such that the L output waveforms of the presteered sensors due to a broadband source in the look direction are identical. As discussed previously, an array may be presteered in direction $\theta_0$ using

$$T_l(\theta_0) = T_0 + \tau_l(\theta_0), \quad l = 1, 2, \ldots, L \tag{5.8.2}$$

where $T_0$ is a bulk delay, such that

$$T_l(\theta_0) > 0 \quad \forall\, l \tag{5.8.3}$$

Let an LJ-dimensional vector X(t) defined by (4.1.6) denote the array signals across the TDL structure and $R_0$ denote the array correlation matrix in the absence of source correlation. Let $R_0(m,n)$ denote the (m,n)th block of $R_0$ given by

$$R_0(m,n) = E\left[x(t-(m-1)T)\,x^T(t-(n-1)T)\right] \tag{5.8.4}$$

It follows from (4.1.11) that $[R_0(m,n)]_{lk}$ due to a source in direction $\theta$ is given by

$$[R_0(m,n)]_{lk} = \rho\big((m-n)T + T_l(\theta_0) - T_k(\theta_0) + \tau_k(\theta) - \tau_l(\theta)\big), \quad l,k = 1, 2, \ldots, L \tag{5.8.5}$$

where $\rho(\tau)$ denotes the correlation function defined by (4.1.12).

Let $[R_0(m,n)]_{l,l+k}$, $l = 1, 2, \ldots, L-k$, denote the elements of the kth diagonal of the matrix $R_0(m,n)$, $k = 0, 1, \ldots, L-1$. Thus, (5.8.5) can be expressed as

$$[R_0(m,n)]_{l,l+k} = \rho\big((m-n)T + T_l - T_{l+k} + \tau_{l+k} - \tau_l\big), \quad k = 0, 1, \ldots, L-1, \quad l = 1, 2, \ldots, L-k \tag{5.8.6}$$

where the parameters $\theta_0$ and $\theta$ are omitted for ease of notation. It follows from (5.8.1) and (5.8.2) that

$$T_l - \tau_l = T_0 + \frac{d}{c}(l-1)(\cos\theta_0 - \cos\theta) \tag{5.8.7}$$

and

$$T_{l+k} - \tau_{l+k} = T_0 + \frac{d}{c}(l+k-1)(\cos\theta_0 - \cos\theta) \tag{5.8.8}$$

Equations (5.8.7) and (5.8.8) imply that

$$T_l - \tau_l - (T_{l+k} - \tau_{l+k}) = -\frac{d}{c}\,k\,(\cos\theta_0 - \cos\theta) \tag{5.8.9}$$

Substituting from (5.8.9) in (5.8.6),

$$[R_0(m,n)]_{l,l+k} = \rho\left((m-n)T - \frac{d}{c}\,k\,(\cos\theta_0 - \cos\theta)\right), \quad k = 0, 1, \ldots, L-1, \quad l = 1, 2, \ldots, L-k \tag{5.8.10}$$

As seen on the RHS of (5.8.10), the argument of the correlation function depends only on k and not on l. Thus, it follows that all L − k elements of the kth diagonal of the matrix $R_0(m,n)$ are the same. Hence, each L × L block $R_0(m,n)$, m, n = 1, 2, ..., J, of the array correlation matrix has the Toeplitz structure in the absence of correlation. The existence of correlation between the directional sources destroys this structure. An array-processing method is discussed later in the chapter to restore this structure in the array correlation matrix before using it to estimate the weights of the TDL structure.
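The Toeplitz property implied by (5.8.10) can be checked numerically. The sketch below builds one block from (5.8.5) for a single source, under assumptions that are not part of the text: a unit-power brick-wall spectrum on 0.45 ≤ |f| ≤ 0.5 (normalized frequency), d/c equal to one sample, T equal to one sample, and zero-based element indices. The names `rho` and `r0_block` are illustrative.

```python
import numpy as np

def rho(tau, f_lo=0.45, f_hi=0.5):
    """Correlation function of a unit-power brick-wall spectrum on f_lo <= |f| <= f_hi."""
    tau = np.asarray(tau, dtype=float)
    # np.sinc(x) = sin(pi x) / (pi x), so f * sinc(2 f tau) = sin(2 pi f tau) / (2 pi tau)
    return (f_hi * np.sinc(2 * f_hi * tau) - f_lo * np.sinc(2 * f_lo * tau)) / (f_hi - f_lo)

def r0_block(m, n, n_elements, theta0_deg, theta_deg, d_over_c=1.0, tap_delay=1.0, bulk=0.0):
    """(m, n)th block of R_0 for one source in direction theta, following (5.8.5)."""
    l = np.arange(n_elements)                                    # plays the role of l - 1
    tau = d_over_c * l * np.cos(np.deg2rad(theta_deg))           # propagation delays, (5.8.1)
    T = bulk + d_over_c * l * np.cos(np.deg2rad(theta0_deg))     # presteering delays, (5.8.2)
    arg = ((m - n) * tap_delay
           + (T[:, None] - T[None, :])        # T_l - T_k
           + (tau[None, :] - tau[:, None]))   # tau_k - tau_l
    return rho(arg)

def is_toeplitz(block, tol=1e-12):
    """True when every diagonal of the block is constant."""
    size = block.shape[0]
    return all(np.ptp(np.diagonal(block, offset=k)) < tol for k in range(-(size - 1), size))

block = r0_block(m=2, n=1, n_elements=8, theta0_deg=90.0, theta_deg=50.0)
print("Toeplitz:", is_toeplitz(block))
```

Because the argument passed to `rho` depends on the row and column indices only through their difference, every diagonal of the block is constant, which is the structure the later averaging step relies on.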


5.8.2 Correlated Field Model 

Without any loss of generality, assume that there are two correlated broadband directional sources. One source is a signal source and the other source is interference. Let $p_s$ and $p_I$ represent the powers of the signal source and the interference source, respectively. Let $\theta_0$ and $\theta_I$ denote the directions of the two sources, respectively. Assume that the interference contains a component of the desired signal such that the output of a sensor present at the center of the coordinate system, assumed to be the time reference, can be expressed as

$$x(t) = \sqrt{p_s}\,m_s(t) + \sqrt{p_I}\left\{\alpha\,m_s(t - T_c) + \sqrt{1-\alpha^2}\,m_I(t)\right\} + n(t) \tag{5.8.11}$$

where $m_s(t)$ and $m_I(t)$ are zero-mean, unit-variance, low-pass processes associated with the signal source and the interference source, respectively; n(t) is the random noise component with zero mean and variance $\sigma_n^2$; $\alpha$ is a positive real scalar denoting the magnitude of correlation; and $T_c$ is a real scalar denoting the time delay for the correlated field. For two coherent sources, the magnitude of correlation equals 1. It is assumed that $m_s(t)$, $m_I(t)$, and n(t) are mutually uncorrelated. The autocorrelation functions of $m_s(t)$ and $m_I(t)$ are denoted by $\rho_s(\tau)$ and $\rho_I(\tau)$, respectively.

It should be noted that although the following analysis is for the specialized model of (5.8.11) emphasizing a multipath application, the results are equally valid for a more general correlated field model of the type

$$x(t) = \sqrt{p_s}\,m_s(t) + \sqrt{p_I}\left\{\alpha\,m_2(t) + \sqrt{1-\alpha^2}\,m_I(t)\right\} + n(t) \tag{5.8.12}$$

with the cross-correlation

$$\rho(\tau) = E\left[m_s(t)\,m_2(t-\tau)\right] \tag{5.8.13}$$

that is assumed to be band limited.

It follows from (5.8.11) that the output of the mth tap on the lth sensor, presteered in the $\theta_0$ direction, is given by

$$\begin{aligned} x_l(t-(m-1)T) = {} & \sqrt{p_s}\,m_s(t-(m-1)T-T_0) \\ & + \sqrt{p_I}\left\{\alpha\,m_s(t - T_c + \tau_l - T_l - (m-1)T) + \sqrt{1-\alpha^2}\,m_I(t + \tau_l - T_l - (m-1)T)\right\} \\ & + n_l(t-(m-1)T-T_l) \end{aligned} \tag{5.8.14}$$

Thus,

$$\begin{aligned} [R(m,n)]_{lk} = {} & E\left[x_l(t-(m-1)T)\,x_k(t-(n-1)T)\right] \\ = {} & p_s\,\rho_s[(m-n)T] + p_I\left\{\alpha^2\rho_s[(m-n)T + \tau_k - \tau_l + T_l - T_k] + (1-\alpha^2)\rho_I[(m-n)T + \tau_k - \tau_l + T_l - T_k]\right\} \\ & + \sqrt{p_s p_I}\,\alpha\left\{\rho_s[\tau_k - T_k + (m-n)T - T_c + T_0] + \rho_s[-\tau_l + T_l + (m-n)T + T_c - T_0]\right\} \\ & + \sigma_n^2\,\delta(l-k)\,\delta(m-n) \end{aligned} \tag{5.8.15}$$

where

$$\delta(i) = \begin{cases} 0 & i \neq 0 \\ 1 & i = 0 \end{cases} \tag{5.8.16}$$


It follows from (5.8.15) that the array correlation matrix R in the presence of correlated sources can be expressed as

$$R = \left[R_S + R_I' + \sigma_n^2 I\right] + Q \tag{5.8.17}$$

where $R_S$ denotes the array correlation matrix due to the signal source in the look direction, $R_I'$ denotes the array correlation matrix due to the interference with the effective autocorrelation given by

$$\rho_I'(\tau) = \alpha^2\rho_s(\tau) + (1-\alpha^2)\rho_I(\tau) \tag{5.8.18}$$

and Q denotes the array correlation matrix due to the cross-correlation between the signal source and the interference. Expressions for $R_S$, $R_I'$, and Q are given by the first term, second term, and third term, respectively, on the RHS of (5.8.15).

The quantity inside the square brackets on the RHS of (5.8.17) represents the total array correlation matrix due to the uncorrelated field. Thus,

$$R = R_0 + Q \tag{5.8.19}$$


5.8.3 Structured Beamforming Method 

For a linear array of equispaced elements immersed in a homogeneous and uncorrelated noise field, the array correlation matrix has a block Toeplitz structure; that is, each L × L block of the correlation matrix arising from the correlation of sensor vectors at any two taps has the Toeplitz structure. The correlation between two directional sources destroys this structure. The structured beamforming method described here uses an estimate of the array correlation matrix with the constraint that the estimated matrix has the block Toeplitz structure. Let $\bar{R}$ denote the array correlation matrix estimated with this structure. The weights of the beamformer estimated with the structured method are calculated using the following expression:

$$\hat{W} = \tilde{R}^{-1}C\left(C^T\tilde{R}^{-1}C\right)^{-1}f \tag{5.8.20}$$

where

$$\tilde{R} = \bar{R} + \beta I \tag{5.8.21}$$

$\beta$ is a positive scalar selected such that $\tilde{R}$ is positive definite, f is given by (4.1.25), and $\bar{R}$ is the structured correlation matrix.

An estimate of the structured correlation matrix is made by averaging each L × L block of the matrix along its diagonals. Let $\bar{R}(m,n)$ denote the (m,n)th block of the averaged matrix. The entries along the kth diagonal of $\bar{R}(m,n)$ are given by

$$[\bar{R}(m,n)]_k = \frac{1}{L-k}\sum_{l=1}^{L-k}[R(m,n)]_{l,l+k}, \quad k = 0, 1, \ldots, L-1 \tag{5.8.22}$$
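The diagonal averaging of (5.8.22) and the weight computation of (5.8.20) and (5.8.21) can be sketched as follows. Several choices here are assumptions rather than statements from the text: the diagonals below the main diagonal are averaged in the same way as those above it, the array signals are stacked tap by tap so that block (m, n) is an L × L slice, and the constraint matrix C is taken to be the LJ × J matrix with a column of L ones per tap. The correlation matrix used in the example is a random placeholder, not data from the book.

```python
import numpy as np

def average_block_diagonals(block):
    """Replace each diagonal of a square block by its mean, in the spirit of (5.8.22)."""
    block = np.asarray(block)
    size = block.shape[0]
    out = np.empty(block.shape, dtype=np.result_type(block.dtype, np.float64))
    rows, cols = np.indices((size, size))
    for k in range(-(size - 1), size):
        out[cols - rows == k] = np.diagonal(block, offset=k).mean()
    return out

def structured_estimate(R, L, J):
    """Block-Toeplitz estimate: average every L x L block of R along its diagonals."""
    R = np.asarray(R)
    R_bar = np.empty(R.shape, dtype=np.result_type(R.dtype, np.float64))
    for m in range(J):
        for n in range(J):
            rows = slice(m * L, (m + 1) * L)
            cols = slice(n * L, (n + 1) * L)
            R_bar[rows, cols] = average_block_diagonals(R[rows, cols])
    return R_bar

def structured_weights(R, L, J, f, beta):
    """Weights of (5.8.20) with the diagonally loaded matrix of (5.8.21)."""
    R_tilde = structured_estimate(R, L, J) + beta * np.eye(L * J)
    C = np.kron(np.eye(J), np.ones((L, 1)))       # assumed constraint matrix, LJ x J
    Ri_C = np.linalg.solve(R_tilde, C)            # R_tilde^{-1} C without forming the inverse
    return Ri_C @ np.linalg.solve(C.T @ Ri_C, f)

# Placeholder example: sample correlation matrix of white noise, L = 4 elements, J = 3 taps.
rng = np.random.default_rng(0)
L, J = 4, 3
X = rng.standard_normal((L * J, 200))
R = X @ X.T / X.shape[1]
f = np.zeros(J)
f[J // 2] = 1.0
W = structured_weights(R, L, J, f, beta=0.1)
R_bar = structured_estimate(R, L, J)
```

Solving the two linear systems instead of inverting the loaded matrix explicitly is a numerical convenience; it yields the same weights as the closed-form expression.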


5.8.4 Decorrelation Analysis 

In this section, an analysis is presented to show the decorrelation effect of the structured method when the block correlation matrix is estimated by averaging along the diagonals. It follows from (5.8.17) and (5.8.19) that the matrix $R_0$ is not affected by the above method since it has the block Toeplitz structure. Thus, it is sufficient to examine the effect of averaging along the diagonals on matrix Q. It follows from (5.8.15) that

$$[Q(m,n)]_{lk} = \rho_s[\tau_k - T_k + (m-n)T - T_c + T_0] + \rho_s[-\tau_l + T_l + (m-n)T + T_c - T_0] \tag{5.8.23}$$

where the constant $\alpha\sqrt{p_s p_I}$ has been suppressed for ease of analysis. $Q_k(m,n)$, the kth diagonal of Q(m,n), is given by

$$Q_k(m,n) = \rho_s[\tau_{l+k} - T_{l+k} + (m-n)T - T_c + T_0] + \rho_s[-\tau_l + T_l + (m-n)T + T_c - T_0] \tag{5.8.24}$$


Since

$$\tau_l = \frac{d}{c}(l-1)\cos\theta_I \tag{5.8.25}$$

$$T_l = T_0 + \frac{d}{c}(l-1)\cos\theta_0 \tag{5.8.26}$$

$$\tau_{l+k} = \frac{d}{c}(l+k-1)\cos\theta_I \tag{5.8.27}$$

and

$$T_{l+k} = T_0 + \frac{d}{c}(l+k-1)\cos\theta_0 \tag{5.8.28}$$

it follows from the fact that $\rho_s(\tau) = \rho_s(-\tau)$ and (5.8.24) that

$$Q_k(m,n) = \rho_s[(l+k-1)\psi + \eta - T_c] + \rho_s[(l-1)\psi - \eta - T_c] \tag{5.8.29}$$

where

$$\psi = \frac{d}{c}(\cos\theta_I - \cos\theta_0) \tag{5.8.30}$$

and

$$\eta = (m-n)T \tag{5.8.31}$$

Thus, from (5.8.22), it follows that

$$\bar{Q}_k(m,n) = \frac{1}{L-k}\sum_{l=1}^{L-k}\rho_s[(l+k-1)\psi + \eta - T_c] + \frac{1}{L-k}\sum_{l=1}^{L-k}\rho_s[(l-1)\psi - \eta - T_c] \tag{5.8.32}$$

Substituting from (4.1.13) in (5.8.32),

$$\begin{aligned} \bar{Q}_k(m,n) = {} & \frac{1}{L-k}\int_{-\infty}^{\infty} S(f)\sum_{l=1}^{L-k} e^{j2\pi f\{(l+k-1)\psi + \eta - T_c\}}\,df + \frac{1}{L-k}\int_{-\infty}^{\infty} S(f)\sum_{l=1}^{L-k} e^{j2\pi f\{(l-1)\psi - \eta - T_c\}}\,df \\ = {} & \int_{-\infty}^{\infty} S(f)\,e^{-j2\pi f T_c}\left(e^{-j2\pi f\eta} + e^{j2\pi f(k\psi + \eta)}\right)\frac{1}{L-k}\sum_{l=1}^{L-k} e^{j2\pi f(l-1)\psi}\,df \end{aligned} \tag{5.8.33}$$

where S(f) denotes the power spectral density of the desired signal.

Using


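$$1 + a + a^2 + \cdots + a^{N-1} = \frac{1-a^N}{1-a} \tag{5.8.34}$$

in (5.8.33),

$$\begin{aligned} \bar{Q}_k(m,n) = {} & \int_{-\infty}^{\infty} S(f)\,e^{-j2\pi f T_c}\,\frac{e^{-j2\pi f\eta} + e^{j2\pi f(k\psi+\eta)}}{L-k}\,\frac{1 - e^{j2\pi f(L-k)\psi}}{1 - e^{j2\pi f\psi}}\,df \\ = {} & \int_{-\infty}^{\infty} S(f)\exp\left\{-j2\pi f\left(T_c - \frac{L-1}{2}\psi\right)\right\}\frac{2\cos \pi f(k\psi + 2\eta)\,\sin \pi f(L-k)\psi}{(L-k)\sin \pi f\psi}\,df \end{aligned} \tag{5.8.35}$$

Using

$$S(f) = S(-f) \tag{5.8.36}$$

in (5.8.35),

$$\bar{Q}_k(m,n) = \int_{0}^{\infty}\frac{4S(f)\cos 2\pi f\psi_c\,\cos \pi f(k\psi + 2\eta)\,\sin \pi f(L-k)\psi}{(L-k)\sin \pi f\psi}\,df \tag{5.8.37}$$

where

$$\psi_c = T_c - \frac{L-1}{2}\,\frac{d}{c}(\cos\theta_I - \cos\theta_0) \tag{5.8.38}$$

is the correlation delay time measured at the center of the array.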
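The step from the finite sums in (5.8.33) to the closed form in (5.8.35) is a geometric-series identity, and it can be spot-checked at a single frequency. The sketch below compares the two expressions with the common factor exp(−j2πfT_c) left out of both; the function names and the numerical values of f, ψ, and η are arbitrary illustrative choices.

```python
import numpy as np

def kernel_by_summation(f, L, k, psi, eta):
    """Averaged sums of complex exponentials from (5.8.33), without exp(-j 2 pi f T_c)."""
    l = np.arange(1, L - k + 1)
    first = np.exp(2j * np.pi * f * ((l + k - 1) * psi + eta))
    second = np.exp(2j * np.pi * f * ((l - 1) * psi - eta))
    return (first.sum() + second.sum()) / (L - k)

def kernel_closed_form(f, L, k, psi, eta):
    """Closed form appearing in (5.8.35), without exp(-j 2 pi f T_c)."""
    phase = np.exp(1j * np.pi * f * (L - 1) * psi)
    numerator = 2 * np.cos(np.pi * f * (k * psi + 2 * eta)) * np.sin(np.pi * f * (L - k) * psi)
    return phase * numerator / ((L - k) * np.sin(np.pi * f * psi))

f, L, k = 0.47, 8, 3
psi = np.cos(np.deg2rad(50.0))      # d/c = 1 sample, look direction broadside (assumed values)
eta = 2.0                           # (m - n) T with two taps of separation

lhs = kernel_by_summation(f, L, k, psi, eta)
rhs = kernel_closed_form(f, L, k, psi, eta)
print(abs(lhs - rhs))
```

The 1/((L − k) sin πfψ) factor in the closed form is what drives the averaged cross-correlation terms toward zero as the array grows.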

The following result is true for a signal source of finite bandwidth. The result is proved later in the discussion.

If S(f) = 0 outside the frequency range of interest $[f_L, f_H]$ and is bounded over this finite range, then

$$\lim_{L\to\infty}\frac{1}{L}\sum_{l,k=1}^{L}\left|[\bar{Q}(m,n)]_{lk}\right|^2 = 0 \tag{5.8.39}$$

where $\bar{Q}(m,n)$ is the (m,n)th block of the structured correlation matrix arising from the correlated source. The above result states that the Hilbert-Schmidt norm of the matrix goes to zero as $L \to \infty$.

The Hilbert-Schmidt norm of a matrix A satisfies the following axiom [Gra77]:

$$\|A\| = 0 \ \text{iff}\ A = 0, \ \text{the all-zero matrix.} \tag{5.8.40}$$

Thus, it follows that $\bar{Q}(m,n)$ is an all-zero matrix for an infinitely large array; hence, $\bar{Q}$ is an all-zero matrix in the limit. This along with (5.8.17) implies that in the limit,

$$\bar{R} = R_0 \tag{5.8.41}$$

Thus, for an infinitely large array, the effect of correlation is completely canceled using the structured beamforming method.

Although the results presented in this section hold for a large array, numerical examples are presented later to show that the method presented here performs satisfactorily for a relatively small array.
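The effect of diagonal averaging on the cross-correlation block can be explored numerically. The sketch below builds Q(1,1) from (5.8.23) with η = 0 and compares a normalized norm before and after averaging for several array sizes. It rests on several assumptions not stated in the text: a unit-power brick-wall spectrum on (0.45, 0.5), d/c equal to one sample, a correlation delay of 0.25 sample (45° at a maximum normalized frequency of 0.5), the suppressed constant set to one, and the norm taken as the square root of the sum of squared magnitudes divided by L. No particular numerical values are claimed for the output.

```python
import numpy as np

def rho_s(tau, f_lo=0.45, f_hi=0.5):
    """Correlation function of a unit-power brick-wall spectrum on f_lo <= |f| <= f_hi."""
    tau = np.asarray(tau, dtype=float)
    return (f_hi * np.sinc(2 * f_hi * tau) - f_lo * np.sinc(2 * f_lo * tau)) / (f_hi - f_lo)

def q_block(L, theta_i_deg=50.0, theta0_deg=90.0, t_c=0.25, d_over_c=1.0):
    """Q(1,1) from (5.8.23) with m = n, zero-based indices, constant factor omitted."""
    psi = d_over_c * (np.cos(np.deg2rad(theta_i_deg)) - np.cos(np.deg2rad(theta0_deg)))
    idx = np.arange(L)
    column_term = rho_s(idx[None, :] * psi - t_c)   # rho_s[tau_k - T_k - T_c + T_0]
    row_term = rho_s(t_c - idx[:, None] * psi)      # rho_s[-tau_l + T_l + T_c - T_0]
    return column_term + row_term

def toeplitz_average(block):
    """Replace each diagonal by its mean (diagonal averaging)."""
    size = block.shape[0]
    out = np.empty_like(block, dtype=float)
    rows, cols = np.indices((size, size))
    for k in range(-(size - 1), size):
        out[cols - rows == k] = np.diagonal(block, offset=k).mean()
    return out

def normalized_norm(a):
    """Assumed normalization: sqrt of (sum of squared magnitudes) / L."""
    return float(np.sqrt(np.sum(np.abs(a) ** 2) / a.shape[0]))

sizes = (4, 8, 16, 32, 64)
norms = {}
for L in sizes:
    q = q_block(L)
    norms[L] = (normalized_norm(q), normalized_norm(toeplitz_average(q)))
    print(L, norms[L])
```

Averaging along diagonals can never increase this norm, since the mean of a set of numbers has no larger energy than the set itself; according to (5.8.39), the structured value should also shrink as L grows.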

The result given by (5.8.39) is now proved. Let

$$G(f,k) = \frac{4S(f)\cos 2\pi f\psi_c\,\cos \pi f(k\psi + 2\eta)\,\sin \pi f(L-k)\psi}{\sin \pi f\psi} \tag{5.8.42}$$

First, it is shown that |G(f,k)| is bounded over the frequency range of interest $[f_L, f_H]$ for every k. It follows from (5.8.30) that

$$f\psi = \frac{d}{\lambda}(\cos\theta_I - \cos\theta_0) \tag{5.8.43}$$

It is assumed that the inter-element spacing is less than one-half wavelength at all frequencies, that is,

$$\frac{d}{\lambda} < \frac{1}{2} \tag{5.8.44}$$

Equations (5.8.43) and (5.8.44) imply that

$$-1 < f\psi < 1 \tag{5.8.45}$$

From

$$0 < \theta_I < \pi \tag{5.8.46}$$

$$0 < \theta_0 < \pi \tag{5.8.47}$$

and

$$\theta_I \neq \theta_0 \tag{5.8.48}$$

it follows that

$$\cos\theta_I - \cos\theta_0 \neq 0 \tag{5.8.49}$$

and thus

$$f\psi \neq 0 \tag{5.8.50}$$

From (5.8.45) and (5.8.50),

$$\sin \pi f\psi \neq 0 \tag{5.8.51}$$

As

$$|S(f)| < \infty \tag{5.8.52}$$

for the frequency range of interest, it follows from (5.8.42) and (5.8.52) that

$$|G(f,k)| < \infty \quad \forall\, k \tag{5.8.53}$$

Now an outline of the proof of the result is presented. Since


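$$\sum_{l,k=1}^{L}\left|[\bar{Q}(m,n)]_{lk}\right|^2 = L\left|\bar{Q}_0(m,n)\right|^2 + 2\sum_{k=1}^{L-1}(L-k)\left|\bar{Q}_k(m,n)\right|^2 \tag{5.8.54}$$

we need to show that

$$\lim_{L\to\infty}\left|\bar{Q}_0(m,n)\right|^2 + \lim_{L\to\infty}\frac{2}{L}\sum_{k=1}^{L-1}(L-k)\left|\bar{Q}_k(m,n)\right|^2 = 0 \tag{5.8.55}$$

Consider the first term. From (5.8.37) and (5.8.42), it follows that

$$\lim_{L\to\infty}\left|\bar{Q}_0(m,n)\right|^2 = \lim_{L\to\infty}\left|\int_{f_L}^{f_H}\frac{1}{L}\,G(f,0)\,df\right|^2 = \lim_{L\to\infty}\frac{1}{L^2}\left|\int_{f_L}^{f_H} G(f,0)\,df\right|^2 \tag{5.8.56}$$

As |G(f,0)| is bounded, the integration yields a finite value. Thus,

$$\lim_{L\to\infty}\left|\bar{Q}_0(m,n)\right|^2 = 0 \tag{5.8.57}$$

Now, consider the second term of (5.8.55). From (5.8.37) it follows that

$$\text{second term} = \lim_{L\to\infty}\frac{2}{L}\sum_{k=1}^{L-1}\frac{1}{L-k}\left|\int_{f_L}^{f_H} G(f,k)\,df\right|^2 \tag{5.8.58}$$

Since G(f,k) is finite for every k, the integral exists. Let its squared magnitude be denoted by $V_0(k)$. Let $V_0$ be such that

$$V_0(k) < V_0, \quad k = 1, 2, \ldots, L-1 \tag{5.8.59}$$

From (5.8.58) and (5.8.59), it follows that

$$\text{second term} \le \lim_{L\to\infty}\frac{2V_0}{L}\sum_{k=1}^{L-1}\frac{1}{L-k} = 0 \tag{5.8.60}$$

because the sum on the RHS grows only logarithmically with L.

From (5.8.57) and (5.8.60), it follows that (5.8.55) is true. This completes the proof.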

5.8.4.1 Examples and Discussion 

Figure 5.10 shows power patterns for an eight-element linear array in the presence of six directional broadband sources using three beamforming methods. All sources are assumed to have a brick-wall type of spectrum with normalized cutoff frequencies of 0.45 and 0.5. The power of each source is 20 dB above the power of the white noise present on each element of the array. Five interferences are assumed to be in the far field of the array in directions of 22°, 50°, 68°, 112°, and 130° relative to the line of the array; these directions coincide with the side-lobes of the conventional array pattern. The signal source is broadside to the array. The interference in the direction of 50° is fully correlated with the signal source and delayed by 45° at the maximum frequency. The phase delay is specified at the origin of the coordinate system with the array situated along the x-axis. The spacing between the elements of the array is taken to be one-half wavelength at the maximum frequency. The delay line filter has nine taps (J = 9) with one sample delay between taps. The parameter $\beta$ of (5.8.21) is taken to be equal to 8. The vector f is selected as follows:

$$f_i = \begin{cases} 1 & i = 5 \\ 0 & \text{otherwise} \end{cases}$$

Figure 5.10 compares the power patterns of the conventional, optimal, and structured 
beamformers. The figure shows that the power pattern of the optimal beamformer has an 
increased response in the direction of the correlated jammer, and this increased response 
is responsible for the cancelation of the look direction signal. The power pattern of the 
structured beamformer shown in plot C has its response about -48 dB in the direction of 
the correlated jammer and has clearly suppressed it. The SNR measured at the output of 
the array using the conventional, optimal, and structured beamformers is 45, 1, and 527, 
respectively. 

Figure 5.11 compares the Hilbert-Schmidt norm of the structured as well as the unstructured block of the array correlation matrix as a function of the number of elements in the array. The L × L dimensional block of the array correlation matrix considered corresponds to m = 1 and n = 1. Two sources are considered for the example. The look direction signal is broadside to the array and the correlated interference is in the direction of 50° relative to the line of the array. The other parameters are the same as in Figure 5.10. As seen in the figure, the norm of the structured correlation matrix decreases as the number of elements in the array increases. On the other hand, the norm for the unstructured matrix increases.


© 2004 by CRC Press LLC 




FIGURE 5.10

Power patterns of an element space processor using conventional, optimal, and structured beamforming methods using an eight-element linear array with one-half wavelength spacing at the maximum frequency in the presence of six directional broadband sources in directions 22°, 50°, 68°, 90°, 112°, and 130°, each with unity power and frequency range (0.45, 0.5). The horizontal axis is the angle in degrees, from 0 to 180. Look direction is 90°, $\sigma_n^2 = 0.01$; sources in 50° and 90° are correlated with correlation phase delay of 45° at the maximum frequency measured at the origin. (From Godara, L.C., J. Acoust. Soc. Am., 92, 2702-2708, 1992. With permission.)



FIGURE 5.11

Hilbert-Schmidt norm of Q(1,1) vs. the number of elements in the array in the presence of one broadband correlated directional interference in direction 50° with unity power over frequency range (0.45, 0.5). The correlation phase delay is taken to be 45° at the maximum frequency measured at the origin. Look direction is 90°, $\sigma_n^2 = 0.01$. (From Godara, L.C., J. Acoust. Soc. Am., 92, 2702-2708, 1992. With permission.)







Acknowledgments

Edited versions of Sections 5.3, 5.4, and 5.7 are reprinted from Godara, L.C., Beamforming in the presence of correlated arrivals using structured correlation matrix, IEEE Trans. Acoust. Speech Signal Process., 38(1), 1-15, 1990. An edited version of Section 5.5 is reprinted from Godara, L.C., Application of antenna arrays to mobile communications, I. Beamforming and DOA considerations, Proc. IEEE, 85(8), 1195-1247, 1997. An edited version of Section 5.8 is reprinted from Godara, L.C., Beamforming in the presence of broadband correlated arrivals, J. Acoust. Soc. Am., 92(5), 2702-2708, 1992.


Notation and Abbreviations

ESP           element space processor
PIC           postbeamformer interference canceler
SNR           signal-to-noise ratio
SNR(w)        output SNR of optimal PIC
SNRO          SNR of optimal ESP and optimal PIC
TDL           tapped delay line
A             matrix of steering vectors
A_k           matrix of steering vectors for kth subarray
d             spacing between elements
d             spacing between elements measured in wavelengths
e(n)          difference between g(w(n)) and E[g_st(w(n))|w(n)]
e(n+1)        difference between E[w(n+1)] and E[w(n+1)]
f             constraint vector
G_xy(f)       cross-power spectrum
g(w(n))       gradient of mean output power for given w(n) for uncorrelated sources
g_st(w(n))    gradient of mean output power for given w(n) using structured method
J             length of TDL structure
L             number of elements used by processor, size of subarray
L_0           number of subarrays
M             number of sources
m_k(t)        modulating function of kth source
m_s(t)        unit variance, complex low-pass process of signal source
m_I(t)        unit variance, complex low-pass process of interference source
n(t)          noise vector
n_k(t)        noise vector for kth subarray
P             projection matrix
P             mean output power of optimal ESP
P(w)          mean output power of ESP for given w
P(w)          mean output power of PIC for given w
P(w)          mean output power of optimal PIC
P_S(w)        mean output signal power of optimal PIC
P_I(w)        mean output interference power of optimal PIC
P_n(w)        mean output uncorrelated noise power of optimal PIC
P_N(w)        total mean noise output power of optimal PIC
p_s           power of signal source
p_I           power of interference source
Q             array correlation matrix due to cross-correlation between signal source and interference
Q             structured matrix of Q
Q_m           mth diagonal of Q
R             array correlation matrix
R_0           array correlation matrix when sources are not correlated
R_0(m,n)      (m,n)th block of R_0
R(m,n)        (m,n)th block of R
R_k           array correlation matrix of kth subarray
R_S           array correlation matrix due to signal
R_I           array correlation matrix due to interference
R_n           array correlation matrix due to white noise
R             spatially smoothed array correlation matrix
R             estimate of spatially smoothed correlation matrix
R             structured array correlation matrix
S             source correlation matrix
S             smoothed sources covariance matrix
S_ij          correlation between ith and jth sources
S_ij          smoothed correlation between ith and jth sources
S_0           steering vector in signal direction
S_I           steering vector in interference direction
S(θ_k)        steering vector in direction θ_k
S_k(θ)        steering vector for kth subarray in direction θ
S(f)          power spectral density of desired signal
s(t)          vector of M modulating functions
T_l(θ_0)      steering delay on lth element
T_c           correlation delay time measured at origin
U             fixed weights of interference beam of PIC
V             fixed weights of signal beam of PIC
w             weights of ESP
w             weights of optimal ESP
w(n)          weights estimated by standard algorithm in absence of correlation
w(n)          weights estimated by structured method
w             weight of optimal PIC
X(t)          array signals across TDL structure
x(t)          array signal vector
x_k(t)        array signal vector of kth subarray
Φ             M × M diagonal matrix defined by (5.6.19)
ft            complex scalar defined by (5.4.27)
Ψ_p           phase of correlation coefficient measured at center of array
ψ_c           correlation delay time measured at center of array
α             positive real scalar denoting magnitude of correlation
α_0           scalar defined by (5.7.37)
α_k           scalar defined by (5.7.42)
β_0           scalar defined by (5.7.38)
β             complex scalar defined by (5.2.8)
β             positive scalar to make R positive definite in (5.8.21)
ψ_ij          scalar defined by (5.6.27)
ψ             scalar defined by (5.8.30)
γ             real scalar defined by (5.4.22)
η             scalar defined by (5.8.31)
σ_n²          uncorrelated noise power on each element
λ             Lagrange multiplier
δ_xy(f)       correlation between two broadband signals x(t) and y(t)
δ             correlation between signal and interference
δ_p           phase of correlation coefficient δ
θ_k           direction of kth source
θ_0           direction of signal
θ_I           direction of interference
φ             scalar defined by (5.7.8)
τ_l(θ_k)      delay on lth element for a source in direction θ_k
τ_l^k(θ)      delay on lth element in kth subarray for source in direction θ
ρ             complex scalar defined by (5.2.14)
ρ_xy(τ)       cross-correlation function between x and y
ρ_I(τ)        autocorrelation function of m_I(t)
ρ_s(τ)        autocorrelation function of m_s(t)


Direction-of-Arrival Estimation Methods


6.1 Spectral Estimation Methods 

6.1.1 Bartlett Method 

6.2 Minimum Variance Distortionless Response Estimator 

6.3 Linear Prediction Method 

6.4 Maximum Entropy Method 

6.5 Maximum Likelihood Method 

6.6 Eigenstructure Methods 

6.7 MUSIC Algorithm 

6.7.1 Spectral MUSIC 

6.7.2 Root-MUSIC 

6.7.3 Constrained MUSIC 

6.7.4 Beam Space MUSIC 

6.8 Minimum Norm Method 

6.9 CLOSEST Method 

6.10 ESPRIT Method 

6.11 Weighted Subspace Fitting Method

Notation and Abbreviations

The problem of localizing sources radiating energy by observing their signals received at spatially separated sensors is of considerable importance, occurring in many fields, including radar, sonar, mobile communications, radio astronomy, and seismology. In this chapter, estimation of the direction of arrival (DOA) of narrowband sources of the same central frequency, located in the far field of an array of sensors, is considered. Various DOA estimation methods are described and compared, and their sensitivity to various perturbations is analyzed. The chapter also contains a discussion of various preprocessing and source estimation methods [God96, God97]. Source direction is parameterized by the variable θ. The DOA estimation methods considered include spectral estimation, minimum-variance distortionless response estimator, linear prediction, maximum entropy, and maximum likelihood. Various eigenstructure methods are also described, including many versions of MUSIC algorithms, minimum norm methods, CLOSEST method, ESPRIT method, and the weighted subspace fitting method.

6.1 Spectral Estimation Methods

These methods estimate DOA by computing the spatial spectrum P(θ), that is, the mean power received by an array as a function of θ, and then determining the local maxima of this computed spatial spectrum [Cap69, Lac71, Nut74, Joh82a, Wag84, Zha95, Bar56]. Most of these techniques have their roots in time series analysis. A brief overview and comparison of some of these methods are found in [Lac71, Joh82a].


6.1.1 Bartlett Method 

One of the earliest methods of spectral analysis is the Bartlett method [Lac71, Bar56], in which a rectangular window of uniform weighting is applied to the time series data to be analyzed. For bearing estimation problems using an array, this is equivalent to applying equal weighting on each element. Thus, by steering the array in the θ direction, this method estimates the mean power $P_B(\theta)$, an expression for which is given by

$$P_B(\theta) = \frac{S_\theta^H R\,S_\theta}{L^2} \tag{6.1.1}$$

where $S_\theta$ denotes the steering vector associated with the direction θ, L denotes the number of elements in the array, and R is the array correlation matrix.

A set of steering vectors $\{S_\theta\}$ associated with various directions θ is often referred to as the array manifold in DOA estimation literature. In practice, it may be measured at the time of array calibration. From the array manifold and an estimate of the array correlation matrix, $P_B(\theta)$ is computed using (6.1.1). Peaks in $P_B(\theta)$ are then taken as the directions of the radiating sources.

The process is similar to that of mechanically steering the array in this direction and measuring the output power. Due to the resulting side-lobes, output power is contributed not only from the direction in which the array is steered but also from the directions where the side-lobes are pointing. The processor is also known as the conventional beamformer, and the resolving power of the processor depends on the aperture of the array or the beamwidth of the main lobe.
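A direct evaluation of (6.1.1) over a grid of directions can be sketched as follows. The uniform linear array with half-wavelength spacing, the steering-vector convention, and the single-source correlation matrix are assumptions made for illustration; in practice R would be estimated from array snapshots and the steering vectors taken from the measured array manifold.

```python
import numpy as np

def steering_vector(theta_deg, n_elements, spacing_wl=0.5):
    """Steering vector of a uniform linear array (angle measured from the array line)."""
    l = np.arange(n_elements)
    return np.exp(2j * np.pi * spacing_wl * l * np.cos(np.deg2rad(theta_deg)))

def bartlett_spectrum(R, thetas_deg, spacing_wl=0.5):
    """Evaluate P_B(theta) = S^H R S / L^2 of (6.1.1) on a grid of directions."""
    L = R.shape[0]
    spectrum = []
    for theta in thetas_deg:
        s = steering_vector(theta, L, spacing_wl)
        spectrum.append(np.real(s.conj() @ R @ s) / L ** 2)
    return np.array(spectrum)

# Illustrative scenario: one unit-power source at 60 degrees, noise power 0.01, L = 8.
L = 8
s_true = steering_vector(60.0, L)
R = np.outer(s_true, s_true.conj()) + 0.01 * np.eye(L)

grid = np.arange(0.0, 180.5, 0.5)
P_B = bartlett_spectrum(R, grid)
peak_direction = grid[np.argmax(P_B)]
print("Bartlett peak at", peak_direction, "degrees")
```

At the source direction the spectrum equals the source power plus the noise power divided by L, which is why the peak height can be read as a power estimate.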


6.2 Minimum Variance Distortionless Response Estimator 

The minimum variance distortionless response estimator (MVDR) is the maximum likelihood method (MLM) of spectrum estimation [Cap69], which finds the maximum likelihood (ML) estimate of the power arriving from a point source in direction θ assuming that all other sources are interference. In the beamforming literature, it is known as the MVDR beamformer as well as the optimal beamformer, since in the absence of errors, it maximizes the output SNR and passes the look direction signal undistorted as discussed in Chapter 2. For DOA estimation problems, MLM is used to find the ML estimate of the direction rather than the power [Mil90]. Following this convention, the current estimator is referred to as the MVDR estimator.

This method uses the array weights obtained by minimizing the mean output power subject to a unity constraint in the look direction. The expression for the power spectrum $P_{MV}(\theta)$ is

$$P_{MV}(\theta) = \frac{1}{S_\theta^H R^{-1} S_\theta} \tag{6.2.1}$$

This method has better resolution properties than the Bartlett method [Cox73], but does not have the best resolution properties of all methods [Joh82a].
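The spectrum of (6.2.1) can be evaluated in the same way as the Bartlett spectrum. The sketch below assumes a uniform linear array with half-wavelength spacing and a correlation matrix built from two uncorrelated unit-power sources at 60° and 75° plus white noise; these values and the function names are illustrative only.

```python
import numpy as np

def steering_vector(theta_deg, n_elements, spacing_wl=0.5):
    """Steering vector of a uniform linear array (angle measured from the array line)."""
    l = np.arange(n_elements)
    return np.exp(2j * np.pi * spacing_wl * l * np.cos(np.deg2rad(theta_deg)))

def mvdr_spectrum(R, thetas_deg, spacing_wl=0.5):
    """Evaluate P_MV(theta) = 1 / (S^H R^{-1} S) of (6.2.1) on a grid of directions."""
    L = R.shape[0]
    spectrum = []
    for theta in thetas_deg:
        s = steering_vector(theta, L, spacing_wl)
        Ri_s = np.linalg.solve(R, s)               # R^{-1} S without an explicit inverse
        spectrum.append(1.0 / np.real(s.conj() @ Ri_s))
    return np.array(spectrum)

# Illustrative scenario: two uncorrelated unit-power sources, noise power 0.01, L = 8.
L = 8
s1 = steering_vector(60.0, L)
s2 = steering_vector(75.0, L)
R = np.outer(s1, s1.conj()) + np.outer(s2, s2.conj()) + 0.01 * np.eye(L)

P_at_sources = mvdr_spectrum(R, [60.0, 75.0])
P_between = mvdr_spectrum(R, [67.5])[0]
print("at sources:", P_at_sources, "between sources:", P_between)
```

In this scenario the spectrum is close to the source power at each source direction and much smaller midway between them, which illustrates the resolution behavior described above.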


6.3 Linear Prediction Method 

The linear prediction (LP) method estimates the output of one sensor using linear combinations
of the remaining sensor outputs and minimizes the mean square prediction error,
that is, the error between the estimate and the actual output [Joh82a, Mak75]. Thus, it
obtains the array weights by minimizing the mean output power of the array subject to
the constraint that the weight on the selected sensor is unity. Expressions for the array
weights $w$ and the power spectrum $P_{LP}(\theta)$, respectively, are

$$w = \frac{R^{-1} u_1}{u_1^H R^{-1} u_1} \tag{6.3.1}$$

and

$$P_{LP}(\theta) = \frac{u_1^H R^{-1} u_1}{\left| u_1^H R^{-1} S_\theta \right|^2} \tag{6.3.2}$$

where $u_1$ is a column vector such that one of its elements is unity and the remaining
elements are zero [Joh82a].

The position of 1 in the column vector corresponds to the position of the selected element
in the array for predicting its output. There is no criterion for proper choice of this element;
however, the choice affects the resolution capability and the bias in the estimate.
These effects depend on the SNR and the separation of the directional sources [Joh82a].
LP methods perform well in moderately low SNR environments and are good compromises
in situations where sources are of approximately equal strength and are nearly
coherent [Kes85].
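
As an illustration of (6.3.1) and (6.3.2), the sketch below computes the LP weights and spectrum with the first element selected for prediction. It assumes a half-wavelength ULA, angles measured from the array axis, and an exact correlation matrix; the helper names and the scenario are hypothetical and chosen only for this example.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def largest_peaks(values, count):
    """Indices of the `count` largest strict local maxima, in ascending order."""
    interior = np.arange(1, len(values) - 1)
    is_peak = (values[interior] > values[interior - 1]) & (values[interior] > values[interior + 1])
    candidates = interior[is_peak]
    return np.sort(candidates[np.argsort(values[candidates])[-count:]])

def lp_weights_and_spectrum(R, thetas, selected=0, spacing=0.5):
    """Return the LP weights (6.3.1) and the LP spectrum (6.3.2) on the grid `thetas`."""
    L = R.shape[0]
    u = np.zeros(L)
    u[selected] = 1.0                          # unit vector picking the predicted element
    R_inv_u = np.linalg.solve(R, u)            # R^{-1} u_1
    scale = np.real(u @ R_inv_u)               # u_1^H R^{-1} u_1
    w = R_inv_u / scale
    S = np.column_stack([steering_vector(t, L, spacing) for t in thetas])
    # R is Hermitian, so u_1^H R^{-1} S_theta equals (R^{-1} u_1)^H S_theta.
    denominator = np.abs(R_inv_u.conj() @ S) ** 2
    return w, scale / denominator

# Assumed scenario: 8 elements, two unit-power sources, noise power 0.01.
L = 8
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
R = A @ A.conj().T + 0.01 * np.eye(L)

grid = np.deg2rad(np.arange(0.0, 180.5, 0.5))
w, p_lp = lp_weights_and_spectrum(R, grid, selected=0)
estimated_doas_deg = np.rad2deg(grid[largest_peaks(p_lp, 2)])
```

Changing `selected` moves the unity constraint to another element, which is the design choice the text notes has no established selection criterion.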


6.4 Maximum Entropy Method 

The maximum entropy (ME) method finds a power spectrum such that its Fourier transform
equals the measured correlation, subject to the constraint that its entropy is maximized
[Bur67]. The entropy of a Gaussian band-limited time series with power spectrum
$S(f)$ is defined as

$$H(S) = \int_{-f_N}^{f_N} \ln S(f)\, df \tag{6.4.1}$$

where $f_N$ is the Nyquist frequency.

For estimating DOA from the measurements using an array of sensors, the ME method
finds a continuous function $P_{ME}(\theta) > 0$ such that it maximizes the entropy function

$$H(P) = \int_{0}^{2\pi} \ln P_{ME}(\theta)\, d\theta \tag{6.4.2}$$

subject to the constraint that the measured correlation $r_{ij}$ between the ith and the jth
elements satisfies

$$r_{ij} = \int_{0}^{2\pi} P_{ME}(\theta) \cos\left(2\pi f \tau_{ij}(\theta)\right) d\theta \tag{6.4.3}$$

where $\tau_{ij}(\theta)$ denotes the differential delay between elements i and j due to a source in
direction $\theta$.

The solution to this problem requires an infinite dimensional search. The problem may
be transformed to a finite dimensional search using the duality principle [McC83], leading
to

$$P_{ME}(\theta) = \frac{1}{w^T q(\theta)} \tag{6.4.4}$$

In (6.4.4), $w$ is obtained by minimizing

$$H(w) = \int_{0}^{2\pi} \ln\left(w^T q(\theta)\right) d\theta \tag{6.4.5}$$

subject to

$$w^T r = 2\pi \tag{6.4.6}$$

and

$$w^T q(\theta) > 0 \quad \forall\, \theta \tag{6.4.7}$$

where $q(\theta)$ and $r$, respectively, are defined as

$$q(\theta) = \left[1, \sqrt{2}\cos\left(2\pi f \tau_{12}(\theta)\right), \ldots\right]^T \tag{6.4.8}$$

and

$$r = \left[r_{11}, \sqrt{2}\, r_{12}, \ldots\right]^T \tag{6.4.9}$$


It should be noted that the dimension of these vectors depends on the array geometry
and is equal to the number of known correlations $r_{ij}$ for every possible i and j.

The minimization problem defined above may be solved iteratively using the standard
gradient LMS algorithm. For more information on various issues of the ME method, see
[Nag94, Ski79, Tho80, McC82, Lan83, Far85]. The suitability of the ME method for mobile
communications in fast-fading signal conditions has been studied in [Nag94].


6.5 Maximum Likelihood Method 

The MLM estimates the DOAs from a given set of array samples by maximizing the
log-likelihood function [Mil90, Lig73, Sch68, Zis88, Sto90, Oh92, Lee94, Wu94a, She96]. The
likelihood function is the joint probability density function of the sampled data given the
DOAs, viewed as a function of the desired variables, which are the DOAs in this case.
The method searches for those directions that maximize the log of this function. The ML
criterion signifies that plane waves from these directions are most likely to cause the given
samples to occur [Hay85].

Maximization of the log-likelihood function is a nonlinear optimization problem, and
in the absence of a closed-form solution requires iterative schemes. There are many such
schemes available in the literature. The well-known gradient descent algorithm using the
estimated gradient of the function at each iteration as well as the standard Newton-Raphson
method are well suited for the job [Wax83]. Other schemes, such as the alternating projection
method [Zis88, Oh92] and the expectation maximization algorithm [Mil90, Dem77,
Hin81], have been proposed for solving this problem in general as well as for specialized
cases such as unknown polarization [Lee94a], unknown noise environments [Wu94], and
contaminated Gaussian noise [Lig73]. A fast algorithm [Aba85] based on Newton's method,
developed for estimating frequencies of sinusoids, may be modified to suit DOA estimation
based on ML criteria.
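
To make the search concrete, the sketch below uses the deterministic (conditional) ML criterion for white Gaussian noise, in which the DOAs maximize $\mathrm{tr}(P_A \hat{R})$ with $P_A$ the projector onto the span of the candidate steering vectors. The text does not state the likelihood explicitly, so this criterion, the half-wavelength ULA, the coarse grid, and the exhaustive search (used here in place of the iterative schemes cited above) are all assumptions made for illustration.

```python
import itertools
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def ml_criterion(R, thetas, spacing=0.5):
    """tr(P_A R), where P_A projects onto the span of the candidate steering vectors."""
    A = np.column_stack([steering_vector(t, R.shape[0], spacing) for t in thetas])
    P_A = A @ np.linalg.pinv(A)
    return np.real(np.trace(P_A @ R))

# Assumed scenario: 8 elements, two unit-power sources, noise power 0.1, M = 2 known.
L = 8
true_doas_deg = (60.0, 100.0)
A0 = np.column_stack([steering_vector(np.deg2rad(t), L) for t in true_doas_deg])
R = A0 @ A0.conj().T + 0.1 * np.eye(L)

# Exhaustive search over all pairs on a coarse 2-degree grid (illustration only).
candidates_deg = np.arange(10.0, 171.0, 2.0)
best_pair = max(itertools.combinations(candidates_deg, 2),
                key=lambda pair: ml_criterion(R, np.deg2rad(pair)))
best_pair = tuple(float(t) for t in best_pair)
```

The cost of the exhaustive search grows combinatorially with the number of sources, which is why alternating projection and similar iterative schemes are preferred in practice.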

The MLM provides superior performance compared to other methods, particularly when
the SNR is small, the number of samples is small, or the sources are correlated [Zis88], and
thus is of practical interest. For a single source, the estimates obtained by this method are
asymptotically unbiased [Lee94a], that is, the expected values of the estimates approach
their true values in the limit as the number of samples used in the estimate increases. In
that sense, it may be used as a standard to compare the performance of other methods.
The method normally assumes that the number of sources, M, is known [Zis88].

When a large number of samples is available, other computationally more efficient 
schemes may be used with performance almost equal to this method [Sto90]. Analysis of 
the method to estimate the direction of sources when the array and the source are in 
relative motion to each other indicates its potential for mobile communications [Wig95, 
Zei95]. 


6.6 Eigenstructure Methods 

These methods rely on the following properties of the array correlation matrix: (1) the
space spanned by its eigenvectors may be partitioned into two subspaces, namely the signal
subspace and the noise subspace; and (2) the steering vectors corresponding to the
directional sources are orthogonal to the noise subspace. As the noise subspace is orthogonal
to the signal subspace, these steering vectors are contained in the signal subspace. It
should be noted that the noise subspace is spanned by the eigenvectors associated with
the smaller eigenvalues of the correlation matrix, and the signal subspace is spanned by
the eigenvectors associated with its larger eigenvalues.
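
The two properties can be checked numerically. The following sketch assumes a half-wavelength ULA, two uncorrelated sources with powers 2 and 1, white noise of power 0.1, and an exact correlation matrix; all names and values are hypothetical.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

L, M = 8, 2
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
R = A @ np.diag([2.0, 1.0]) @ A.conj().T + 0.1 * np.eye(L)

eigenvalues, eigenvectors = np.linalg.eigh(R)   # eigenvalues in ascending order
U_N = eigenvectors[:, : L - M]                  # noise subspace: L - M smallest eigenvalues
U_S = eigenvectors[:, L - M:]                   # signal subspace: M largest eigenvalues

# Property (2): source steering vectors are orthogonal to the noise subspace.
noise_leakage = np.abs(U_N.conj().T @ A).max()
# Consequently they lie in the signal subspace: projecting onto it changes nothing.
residual = np.linalg.norm(A - U_S @ (U_S.conj().T @ A))
print(eigenvalues, noise_leakage, residual)
```

With an estimated correlation matrix the orthogonality holds only approximately, which is why the methods below search for minima or peaks rather than exact zeros.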

In principle, the eigenstructure-based methods search for directions such that the steering
vectors associated with these directions are orthogonal to the noise subspace and are
contained in the signal subspace. In practice, the search may be divided into two parts. First,
find a weight vector $w$ that is contained in the noise subspace or is orthogonal to the
signal subspace, and then search for directions such that the steering vectors associated
with these directions are orthogonal to this vector. The source directions correspond to
the local minima of the function $|w^H S_\theta|$, where $S_\theta$ denotes a steering vector.

When these steering vectors are not guaranteed to be in the signal subspace, there may
be more minima than the number of sources. The distinction between an actual source
direction and a spurious minimum in $|w^H S_\theta|$ is made by measuring the power in these
directions.

Many methods have been proposed that utilize the eigenstructure of the array correlation
matrix. These methods differ in the way that available array signals have been utilized,
required array geometry, applicable signal model, and so on. Some of these methods do
not require explicit computation of the eigenvalues and eigenvectors of the array correlation
matrix, whereas in others it is essential. Effective computation of these quantities may
be done by methods similar to those described in [Tuf86]. When the array correlation
matrix is not available, a suitable estimate of the matrix is made from available samples.

One of the earliest DOA estimation methods based on the eigenstructure of the covariance
matrix was presented by Pisarenko [Pis73], and has better resolution than the minimum
variance, ME, and LP methods [Wax84]. A critical comparison of this method with two
other schemes [Red79, Can80] applicable for a correlated noise field has been presented
in [Bor81] to show that Pisarenko's method is an economized version of these schemes,
restricted to equispaced linear arrays. The scheme presented in [Red79] is useful for off-line
implementations similar to those presented in [Joh82, Bro83], whereas the method
described in [Can80] is useful for real-time implementations and uses a normalized gradient
algorithm to estimate a vector in the noise subspace from available array signals. Other
schemes suitable for real-time implementation are discussed in [Red82, Yan88, Lar83]. A
scheme known as the matrix pencil method, shown by [Oui89] to be similar to Pisarenko's
method, has been described in [Oui88].

Eigenstructure methods may also be used for finding DOAs when the background noise
is not white but has a known covariance [Pau86] or an unknown covariance [Wax92], or when
the sources are in the near field and/or the sensors have unknown gain patterns [Wei88].
For the latter case, the signals induced on all elements of the array are not of equal
intensity, as is the case when the array is in the far field of the directional sources. The
effect of spatial coherence on the resolution capability of these methods is discussed in
[Bie80], whereas the issue of the optimality of these methods is considered in [Bie83]. In
the following, some popular schemes are described in detail.


6.7 MUSIC Algorithm 

The multiple signal classification (MUSIC) method [Sch86] is a relatively simple and 
efficient eigenstructure variant of DOA estimation methods. It is perhaps the most studied 
method in its class and has many variations. Some of these are discussed in this section. 

6.7.1 Spectral MUSIC

In its standard form, also known as spectral MUSIC, the method estimates the noise 
subspace from available samples. This can be done either by eigenvalue decomposition 
of the estimated array correlation matrix or singular value decomposition of the data 
matrix with its N columns being the N array signal vector samples, also known as 
snapshots. The latter is preferred for numerical reasons [DeG93]. 

Once the noise subspace has been estimated, a search for M directions is made by looking
for steering vectors that are as orthogonal to the noise subspace as possible. This is
normally accomplished by searching for peaks in the MUSIC spectrum given by

$$P_{MU}(\theta) = \frac{1}{\left| S_\theta^H U_N \right|^2} \tag{6.7.1}$$

where $U_N$ denotes an L by L - M dimensional matrix, with its L - M columns being the
eigenvectors corresponding to the L - M smallest eigenvalues of the array correlation
matrix, and $S_\theta$ denotes the steering vector that corresponds to direction $\theta$.
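
A minimal sketch of spectral MUSIC follows, estimating the noise subspace from the singular value decomposition of a simulated data matrix as described above. The half-wavelength ULA, the simulated Gaussian signals and noise (20 dB SNR, 200 snapshots, fixed random seed), and all function names are assumptions for illustration only.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def largest_peaks(values, count):
    """Indices of the `count` largest strict local maxima, in ascending order."""
    interior = np.arange(1, len(values) - 1)
    is_peak = (values[interior] > values[interior - 1]) & (values[interior] > values[interior + 1])
    candidates = interior[is_peak]
    return np.sort(candidates[np.argsort(values[candidates])[-count:]])

def music_spectrum(U_N, thetas, spacing=0.5):
    """1 / |S^H U_N|^2 for every angle in `thetas`."""
    S = np.column_stack([steering_vector(t, U_N.shape[0], spacing) for t in thetas])
    return 1.0 / np.sum(np.abs(U_N.conj().T @ S) ** 2, axis=0)

# Simulated snapshots (assumed scenario): L x N data matrix.
rng = np.random.default_rng(0)
L, M, N = 8, 2, 200
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
signals = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
X = A @ signals + noise

# Left singular vectors beyond the M largest singular values span the noise subspace.
U, singular_values, _ = np.linalg.svd(X, full_matrices=False)
U_N = U[:, M:]

grid = np.deg2rad(np.arange(0.0, 180.5, 0.5))
p_mu = music_spectrum(U_N, grid)
estimated_doas_deg = np.rad2deg(grid[largest_peaks(p_mu, M)])
```

The peak heights of this spectrum carry no power information; only the peak locations are meaningful.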

It should be noted that instead of using the noise subspace and searching for directions
with steering vectors orthogonal to this subspace, one could also use the signal subspace
and search for directions with steering vectors contained in this space [Bar83]. This
amounts to searching for peaks in

$$P_{MU}(\theta) = \left| U_S^H S_\theta \right|^2 \tag{6.7.2}$$

where $U_S$ denotes an L × M dimensional matrix with its M columns being the eigenvectors
corresponding to the M largest eigenvalues of the array correlation matrix.

It is advantageous to use the one with smaller dimensions. For the case of a single
source, the DOA estimate made by the MUSIC method asymptotically approaches the
Cramer-Rao lower bound, that is, as the number of snapshots increases without limit, the
best possible estimate is made. For multiple sources, the same holds for large SNR cases,
that is, when the SNR approaches infinity [Fri90, Por88]. The Cramer-Rao lower bound
(CRLB) gives the theoretical lowest value of the covariance for an unbiased estimator.

In [Klu93], an application of the MUSIC algorithm to cellular mobile communications 
was investigated to locate land mobiles, and it is shown that when multipath arrivals are 
grouped in clusters the algorithm is able to locate the mean of each cluster arriving at a 
mobile. This information then may be used to locate line of sight. Its use for mobile satellite 
communications has been suggested in [Geb95]. 

6.7.2 Root-MUSIC 

For a uniformly spaced linear array (ULA), the MUSIC spectra can be expressed such that
the search for DOA can be made by finding the roots of a polynomial. In this case, the
method is known as root-MUSIC [Bar83]. Thus, root-MUSIC is applicable when a ULA is
used and solves the polynomial rooting problem in contrast to spectral MUSIC's identification
and localization of spectral peaks. Root-MUSIC has better performance than
spectral MUSIC [Rao89a].
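
One common way to set up the rooting problem is sketched below; it is an illustration under stated assumptions rather than the formulation of [Bar83]. For a ULA with steering vector elements $z^n$, $z = e^{j2\pi d\cos\theta}$, the quantity $S^H U_N U_N^H S$ is a polynomial in $z$ whose coefficient of $z^k$ is the sum of the kth diagonal of $U_N U_N^H$. The spacing, simulated data, and function names are hypothetical.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def root_music(R, num_sources, spacing=0.5):
    """Estimate DOAs (radians, ascending) from the roots of the MUSIC polynomial."""
    L = R.shape[0]
    _, eigenvectors = np.linalg.eigh(R)          # ascending eigenvalues
    U_N = eigenvectors[:, : L - num_sources]
    C = U_N @ U_N.conj().T
    # Coefficients of z^(L-1), ..., z^-(L-1), highest power first for np.roots.
    coefficients = [np.trace(C, offset=k) for k in range(L - 1, -L, -1)]
    roots = np.roots(coefficients)
    # Roots come in pairs (z, 1/conj(z)); keep those inside the unit circle
    # and select the ones closest to it.
    inside = roots[np.abs(roots) < 1.0]
    closest = inside[np.argsort(np.abs(inside))[-num_sources:]]
    cos_theta = np.angle(closest) / (2 * np.pi * spacing)
    return np.sort(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Simulated snapshots (assumed scenario, 20 dB SNR, fixed seed).
rng = np.random.default_rng(0)
L, M, N = 8, 2, 200
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
signals = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
X = A @ signals + noise
R_hat = X @ X.conj().T / N                       # estimated array correlation matrix

estimated_doas_deg = np.rad2deg(root_music(R_hat, M))
```

No angular grid is needed here, so the estimates are not limited by a grid resolution.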

6.7.3 Constrained MUSIC 

This method incorporates knowledge of known source directions to improve estimates of
the unknown source directions [DeG93]. The situation arises when some of the source
directions are already known. The method removes signal components induced by these
known sources from the data matrix and then uses the modified data matrix for DOA
estimation. This is achieved by projecting the data matrix onto the orthogonal complement
of the space spanned by the steering vectors associated with the known source directions.
This matrix operation reduces the signal subspace dimension by a number equal to the
number of known sources and improves estimate quality, particularly when known sources
are strong or correlated with unknown sources.
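
The projection step can be illustrated as follows. This sketch shows only the removal of the known-source component and the resulting reduction of the signal subspace dimension; the subsequent DOA search of [DeG93] is not reproduced. The array, directions, noise-free data, and names are assumptions chosen so that the rank count is unambiguous.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

L, N = 8, 100
known_doas = np.deg2rad([100.0])            # direction assumed known in advance
unknown_doas = np.deg2rad([60.0, 130.0])
A_known = np.column_stack([steering_vector(t, L) for t in known_doas])
A_all = np.column_stack([steering_vector(t, L) for t in np.concatenate([known_doas, unknown_doas])])

# Projector onto the orthogonal complement of the span of the known steering vectors.
P_perp = np.eye(L) - A_known @ np.linalg.pinv(A_known)

rng = np.random.default_rng(1)
signals = (rng.standard_normal((3, N)) + 1j * rng.standard_normal((3, N))) / np.sqrt(2)
X = A_all @ signals                         # noise-free data matrix, L x N
X_projected = P_perp @ X                    # modified data matrix

rank_before = np.linalg.matrix_rank(X, tol=1e-8)
rank_after = np.linalg.matrix_rank(X_projected, tol=1e-8)
known_residual = np.linalg.norm(P_perp @ A_known)
print(rank_before, rank_after, known_residual)
```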

6.7.4 Beam Space MUSIC 

The MUSIC algorithms discussed so far process the snapshots received from sensor elements
without any preprocessing, such as forming beams, and thus may be thought of as
element space algorithms. This contrasts with the beamspace MUSIC algorithm, in which
the array data are passed through a beamforming processor before applying MUSIC or
any other DOA estimation algorithm. The beamforming processor output may be thought
of as a set of beams; thus, the processing using these data is normally referred to as
beamspace processing. A number of DOA estimation schemes are discussed in [May87,
Kar90], where data are obtained by forming multiple beams using an array.

DOA estimation in beam space has a number of advantages such as reduced
computation, improved resolution, reduced sensitivity to system errors, reduced resolution
threshold, reduced bias in the estimate, and so on [Fri90, Lee90, Xu93, Zol93, Zol93a].
These advantages arise from the fact that a beamformer is used to form a number of beams
that is less than the number of elements in the array; consequently, less data need to be
processed for DOA estimation.

This process may be understood in terms of array degrees of freedom. Element space 
methods have degrees of freedom equal to the number of elements in the array, whereas 
the degrees of freedom of beamspace methods are equal to the number of beams formed 
by the beamforming filter. Thus, the process reduces the array's degrees of freedom. 
Normally, only M + 1 degrees of freedom to resolve M sources are needed. 
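
The reduction in degrees of freedom can be sketched as below: a 16-element array is reduced to four beams before MUSIC is applied. The beamformer used here, four orthonormal DFT beams covering roughly 68 to 90 degrees, is a hypothetical choice, as are the half-wavelength ULA, the exact correlation matrix, and the restriction of the search to the sector covered by the beams.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def largest_peaks(values, count):
    """Indices of the `count` largest strict local maxima, in ascending order."""
    interior = np.arange(1, len(values) - 1)
    is_peak = (values[interior] > values[interior - 1]) & (values[interior] > values[interior + 1])
    candidates = interior[is_peak]
    return np.sort(candidates[np.argsort(values[candidates])[-count:]])

L, M, B = 16, 2, 4                                   # elements, sources, beams
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([72.0, 85.0])])
R = A @ A.conj().T + 0.01 * np.eye(L)                # element-space correlation matrix

# Beamforming matrix with B orthonormal DFT beams as columns (L x B).
T = np.exp(1j * 2 * np.pi * np.outer(np.arange(L), np.arange(B)) / L) / np.sqrt(L)
R_B = T.conj().T @ R @ T                             # B x B beamspace correlation matrix

_, vectors = np.linalg.eigh(R_B)                     # ascending eigenvalues
U_N = vectors[:, : B - M]                            # beamspace noise subspace

# Search only inside the sector covered by the beams, using beamspace steering vectors.
sector = np.deg2rad(np.arange(60.0, 95.25, 0.25))
S_B = T.conj().T @ np.column_stack([steering_vector(t, L) for t in sector])
spectrum = 1.0 / np.sum(np.abs(U_N.conj().T @ S_B) ** 2, axis=0)
estimated_doas_deg = np.rad2deg(sector[largest_peaks(spectrum, M)])
```

The eigendecomposition is performed on a 4 × 4 matrix instead of a 16 × 16 one, which is the source of the computational saving.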

The root-MUSIC algorithm discussed for the element space case may also be applied to 
this case, giving rise to beamspace root-MUSIC [Zol93, Zol93a]. Computational savings 
for this method are the same as for beamspace methods compared to element space 
methods in general. 


6.8 Minimum Norm Method 


The minimum norm method [Red79, Kum83] is applicable for a ULA, and finds the DOA
estimate by searching for peak locations in the spectrum [Erm94], as in the following
expression:

$$P_{MN}(\theta) = \frac{1}{\left| w^H S_\theta \right|^2} \tag{6.8.1}$$

where $w$ denotes an array weight such that it is of the minimum norm, has first element
equal to unity, and is contained in the noise subspace. The solution to the above problem
leads to the following expression for the spectrum [Erm94, Nic88, Cle89]:

$$P_{MN}(\theta) = \frac{1}{\left| S_\theta^H U_N U_N^H e_1 \right|^2} \tag{6.8.2}$$

where the vector $e_1$ contains all zeros except the first element, which is equal to unity.
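
The minimum norm spectrum (6.8.2) can be sketched as follows, assuming a half-wavelength ULA and an exact correlation matrix; the scenario and helper names are hypothetical.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def largest_peaks(values, count):
    """Indices of the `count` largest strict local maxima, in ascending order."""
    interior = np.arange(1, len(values) - 1)
    is_peak = (values[interior] > values[interior - 1]) & (values[interior] > values[interior + 1])
    candidates = interior[is_peak]
    return np.sort(candidates[np.argsort(values[candidates])[-count:]])

L, M = 8, 2
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
R = A @ A.conj().T + 0.01 * np.eye(L)

_, vectors = np.linalg.eigh(R)                 # ascending eigenvalues
U_N = vectors[:, : L - M]                      # noise subspace

e1 = np.zeros(L)
e1[0] = 1.0
w = U_N @ (U_N.conj().T @ e1)                  # U_N U_N^H e_1, lies in the noise subspace
w_unit_first = w / w[0]                        # same vector scaled so its first element is unity

grid = np.deg2rad(np.arange(0.0, 180.5, 0.5))
S = np.column_stack([steering_vector(t, L) for t in grid])
p_mn = 1.0 / np.abs(S.conj().T @ w) ** 2       # spectrum (6.8.2)
estimated_doas_deg = np.rad2deg(grid[largest_peaks(p_mn, M)])
```

Scaling `w` so that its first element is unity changes the spectrum only by a constant factor, so the peak locations are unaffected.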

Given that the method is applicable for a ULA, the optimization problem to solve for the
array weight may be transformed to a polynomial rooting problem, leading to a
root-minimum-norm method similar to root-MUSIC. A performance comparison [Kri92] indicated
that the variance in the estimate obtained by root-MUSIC is smaller than or equal
to that of the root-minimum-norm method. Schemes to speed up the DOA estimation
algorithm of the minimum norm and to reduce computations are discussed in [Erm94,
Ng90].


6.9 CLOSEST Method 

The CLOSEST method is useful for locating sources in a selected sector. Contrary to
beamspace methods, which work by first forming beams in selected directions, CLOSEST
operates in the element space and in that sense it is an alternative to beamspace MUSIC.
In a way, it is a generalization of the minimum-norm method. It searches for array weights
in the noise subspace that are close to the steering vectors corresponding to DOAs in the
sector under consideration, and thus its name. Depending on the definition of closeness,
it leads to various schemes. A method referred to as FINE (First Principal Vector) selects
an array weight vector by minimizing the angle between the selected vector and the
subspace spanned by the steering vectors corresponding to DOAs in the selected sector.
In short, the method replaces the vector $e_1$ used in the minimum-norm method by a
suitable vector depending on the definition of closeness used. For details about the selection
of these vectors and the relative merits of the CLOSEST method, see [Buc90].


6.10 ESPRIT Method 

Estimation of signal parameters via rotational invariance techniques (ESPRIT) [Roy89] is 
a computationally efficient and robust method of DOA estimation. It uses two identical 
arrays in the sense that array elements need to form matched pairs with an identical 
displacement vector, that is, the second element of each pair ought to be displaced by the 
same distance and in the same direction relative to the first element. 

However, this does not mean that one has to have two separate arrays. The array 
geometry should be such that the elements could be selected to have this property. For 
example, a ULA of four identical elements with inter-element spacing d may be thought 
of as two arrays of three matched pairs, one with first three elements and the second with 
last three elements such that the first and the second elements form a pair, the second and 
the third elements form another pair, and so on. The two arrays are displaced by the 
distance d. The way ESPRIT exploits this subarray structure for DOA estimation is now 
briefly described. 

Let the signals induced on the $l$th pair due to a narrowband source in direction $\theta$ be
denoted by $x_l(t)$ and $y_l(t)$. The phase difference between these two signals depends on the
time taken by the plane wave arriving from the source under consideration to travel from
one element to the other. Assume that the two elements are separated by the displacement
$\Delta_0$. Thus, it follows that

$$y_l(t) = x_l(t)\, e^{j 2\pi \Delta_0 \cos\theta} \tag{6.10.1}$$

where $\Delta_0$ is measured in wavelengths.

Note that $\Delta_0$ is the magnitude of the displacement vector. This vector sets the reference
direction and all angles are measured with reference to this vector. Let the array signals
received by the two K-element arrays be denoted by $x(t)$ and $y(t)$. These are given by

$$x(t) = A s(t) + n_x(t) \tag{6.10.2}$$

and

$$y(t) = A \Phi s(t) + n_y(t) \tag{6.10.3}$$

where $A$ is a K × M matrix with its columns denoting the M steering vectors corresponding
to M directional sources associated with the first subarray, $\Phi$ is an M × M diagonal matrix
with its mth diagonal element given by

$$\Phi_{m,m} = e^{j 2\pi \Delta_0 \cos\theta_m} \tag{6.10.4}$$

$s(t)$ denotes M source signals induced on a reference element, and $n_x(t)$ and $n_y(t)$, respectively,
denote the noise induced on the elements of the two subarrays. Comparing the
equations for $x(t)$ and $y(t)$, it follows that the steering vectors corresponding to M directional
sources associated with the second subarray are given by $A\Phi$.

Let $U_x$ and $U_y$ denote two K × M matrices with their columns denoting the M eigenvectors
corresponding to the largest eigenvalues of the two array correlation matrices $R_{xx}$ and
$R_{yy}$, respectively. As these two sets of eigenvectors span the same M-dimensional signal
space, it follows that these two matrices $U_x$ and $U_y$ are related by a unique nonsingular
transformation matrix $\Psi$, that is,

$$U_x \Psi = U_y \tag{6.10.5}$$

Similarly, these matrices are related to steering vector matrices $A$ and $A\Phi$ by another
unique nonsingular transformation matrix $T$, as the same signal subspace is spanned by
these steering vectors. Thus,

$$U_x = A T \tag{6.10.6}$$

and

$$U_y = A \Phi T \tag{6.10.7}$$

Substituting for $U_x$ and $U_y$ in (6.10.5) and using the fact that $A$ is of full rank,

$$T \Psi T^{-1} = \Phi \tag{6.10.8}$$

According to this statement, the eigenvalues of $\Psi$ are equal to the diagonal elements of
$\Phi$, and the columns of $T^{-1}$ are eigenvectors of $\Psi$.
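
The substitution leading to (6.10.8) can be written out step by step. This is only a restatement of the argument above, using the left inverse of $A$, which exists because $A$ has full column rank:

$$
\begin{aligned}
U_x \Psi = U_y \;&\Rightarrow\; A T \Psi = A \Phi T \\
&\Rightarrow\; T \Psi = \Phi T \qquad \text{(premultiplying by } (A^H A)^{-1} A^H \text{)} \\
&\Rightarrow\; T \Psi T^{-1} = \Phi .
\end{aligned}
$$

Equivalently, $\Psi = T^{-1} \Phi T$, so $\Psi$ and $\Phi$ are similar matrices and share the same eigenvalues.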

This is the main relationship in the development of ESPRIT [Roy89]. It requires an
estimate of $\Psi$ from the measurements $x(t)$ and $y(t)$. An eigendecomposition of $\Psi$ provides
its eigenvalues $\lambda_m$, and equating them to the diagonal elements of $\Phi$ leads to the DOA
estimates

$$\theta_m = \cos^{-1}\left\{ \frac{\mathrm{Arg}(\lambda_m)}{2\pi \Delta_0} \right\}, \quad m = 1, \ldots, M \tag{6.10.9}$$

The ways in which estimates of $\Psi$ were efficiently obtained from the array signal
measurements led to many versions of ESPRIT [Roy89, Xu94, Ham94, Roy86, Pau86a,
Wei91]. The one summarized below is referred to as total least squares (TLS) ESPRIT
[Roy89, Xu94]:

1. Make measurements from two identical subarrays that are displaced by $\Delta_0$. Estimate
the two array correlation matrices from the measurements and find their
eigenvalues and eigenvectors.

2. Find the number of directional sources M using available methods; some are
described in Section 6.14.

3. Form the two matrices with their columns being the M eigenvectors associated
with the largest eigenvalues of each correlation matrix. Let these be denoted by
$U_x$ and $U_y$. For a ULA, this could be done by first forming an L × M matrix $U$ by
selecting its columns as the M eigenvectors associated with the largest eigenvalues
of the estimated array correlation matrix of the full array of L elements. Then
select the first K < L rows of $U$ to form $U_x$ and its last K rows to form $U_y$.

4. Form a 2M × 2M matrix

$$\begin{bmatrix} U_x^H \\ U_y^H \end{bmatrix} \begin{bmatrix} U_x & U_y \end{bmatrix} \tag{6.10.10}$$

and find its eigenvalues $\lambda_1 \geq \cdots \geq \lambda_{2M}$. Let $\Lambda$ be a diagonal matrix:

$$\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_{2M}) \tag{6.10.11}$$

Let the eigenvectors associated with $\lambda_1 \geq \cdots \geq \lambda_{2M}$ be the columns of a matrix $V$
such that

$$\begin{bmatrix} U_x^H \\ U_y^H \end{bmatrix} \begin{bmatrix} U_x & U_y \end{bmatrix} = V \Lambda V^H \tag{6.10.12}$$

5. Partition $V$ into four matrices of dimension M × M as

$$V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix} \tag{6.10.13}$$

6. Calculate the eigenvalues $\lambda_m$, m = 1, ..., M, of the matrix $-V_{12} V_{22}^{-1}$.

7. Estimate the angle of arrival $\theta_m$ using

$$\theta_m = \cos^{-1}\left\{ \frac{\mathrm{Arg}(\lambda_m)}{2\pi \Delta_0} \right\}, \quad m = 1, \ldots, M \tag{6.10.14}$$
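
The seven steps above can be sketched for a ULA as follows. The sketch assumes that the number of sources M is already known (step 2 is not implemented), that the two subarrays are the first and last L - 1 elements of a half-wavelength ULA so that the displacement is 0.5 wavelengths, and that the data are simulated; the function name `tls_esprit` and the scenario are hypothetical.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def tls_esprit(X, num_sources, displacement=0.5):
    """TLS-ESPRIT for a ULA; X is the L x N data matrix, displacement in wavelengths."""
    L, N = X.shape
    M = num_sources
    R = X @ X.conj().T / N                           # step 1: estimated correlation matrix
    _, vectors = np.linalg.eigh(R)                   # eigenvalues in ascending order
    U = vectors[:, -M:]                              # step 3: M principal eigenvectors
    U_x, U_y = U[:-1, :], U[1:, :]                   # first and last K = L - 1 rows
    U_xy = np.hstack([U_x, U_y])                     # K x 2M
    _, V = np.linalg.eigh(U_xy.conj().T @ U_xy)      # step 4: 2M x 2M eigenproblem
    V = V[:, ::-1]                                   # columns ordered by descending eigenvalue
    V12, V22 = V[:M, M:], V[M:, M:]                  # step 5: partition
    psi = -V12 @ np.linalg.inv(V22)                  # step 6
    lam = np.linalg.eigvals(psi)
    cos_theta = np.angle(lam) / (2 * np.pi * displacement)   # step 7
    return np.sort(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Simulated snapshots (assumed scenario, 20 dB SNR, fixed seed).
rng = np.random.default_rng(0)
L, M, N = 8, 2, 200
A = np.column_stack([steering_vector(t, L) for t in np.deg2rad([60.0, 100.0])])
signals = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
noise = 0.1 * (rng.standard_normal((L, N)) + 1j * rng.standard_normal((L, N))) / np.sqrt(2)
X = A @ signals + noise

estimated_doas_deg = np.rad2deg(tls_esprit(X, M))
```

Unlike the spectral methods, no search over directions is required; the DOAs follow directly from the eigenvalues.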

Other ESPRIT variations include beamspace ESPRIT [Xu94], beamspace ESPRIT for
uniform rectangular array [Gan96], resolution-enhanced ESPRIT [Ham94], virtual interpolated
array ESPRIT [Pau86a], multiple invariance ESPRIT [Swi92a], higher-order
ESPRIT [Yue96], and Procrustes rotation-based ESPRIT [Zol89]. Use of ESPRIT for DOA
estimation employing an array at a base station in the reverse link of a mobile communication
system has been studied in [Wan95].


6.11 Weighted Subspace Fitting Method 

The weighted subspace fitting (WSF) method [Vib91, Vib91a] is a unified approach to
schemes such as MLM, MUSIC, and ESPRIT. It requires that the number of directional
sources be known. The method finds the DOA such that the weighted version of a matrix
whose columns are the steering vectors associated with these directions is close to a
data-dependent matrix. The data-dependent matrix could be a Hermitian square root of the
array correlation matrix or a matrix whose columns are the eigenvectors associated with
the largest eigenvalues of the array correlation matrix. The framework proposed in the
method can be used for deriving common numerical algorithms for various eigenstructure
methods as well as for their performance studies. WSF application for mobile communications
employing an array at the base station has been investigated in [And91, Klo96].


6.12 Review of Other Methods 

In this section, a brief review of methods not covered in detail is provided. A number of
eigenstructure methods reported in the literature exploit specialized array structures or
noise scenarios. Two methods using uniform circular arrays presented in [Mat94] extend
beamspace MUSIC and ESPRIT algorithms for two-dimensional angle estimation, including
an analysis of MUSIC to resolve two sources in the presence of gain, phase, and location
errors. Properties of an array have also been exploited in [Swi93] to find the azimuth and
elevation of a directional source. Two DOA estimation schemes in an unknown noise field
using two separate arrays proposed in [Wu94a] appear to offer superior performance
compared to their conventional counterparts.

Use of a minimum redundancy linear array offers several advantages, as discussed in
[Zol93]. By using such arrays, one may be able to resolve more than L sources using L
elements, L(L - 1)/2 being the upper limit. A minimum redundancy linear array has
nonuniform spacing such that the number of sensor pairs measuring the same spatial
correlation lag is minimized for a given number of elements. In designing such an array,
having only one pair with spacing d, one pair with spacing 2d, and so on is preferred,
such as a three-element array with element positions $x_1 = 0$, $x_2 = d$, and $x_3 = 3d$. The
minimum redundancy linear arrays are also referred to as augmented arrays [God88].
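
The redundancy of an array can be checked by counting how many sensor pairs share each spacing. The short sketch below does this for the three-element example above and, for comparison, for a three-element uniformly spaced array; positions are expressed in units of d and the variable names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def lag_counts(positions):
    """Count how many sensor pairs measure each spatial lag (in units of d)."""
    return Counter(b - a for a, b in combinations(sorted(positions), 2))

mra_lags = lag_counts([0, 1, 3])   # minimum redundancy example: x1 = 0, x2 = d, x3 = 3d
ula_lags = lag_counts([0, 1, 2])   # uniformly spaced array with the same number of elements

print(dict(mra_lags))  # each of the lags d, 2d, 3d is measured by exactly one pair
print(dict(ula_lags))  # lag d is measured twice, and lag 3d is not available
```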

The direction-finding methods applicable to an unknown noise field are described in
[Wax92, LeC89, Won92, Rei92, Ami92]. The MAP (maximum a posteriori) method presented
in [Won92, Rei92] is based on Bayesian analysis, and estimated results are not
asymptotically consistent, that is, the results may be biased [Wu94a]. The method in
[Ami92], referred to as concurrent nulling and location (CANAL), may be implemented
using analog hardware, thus eliminating the need for sampling, data storage, and so on.
A DOA estimation method in the presence of correlated arrivals using an array of unrestricted
geometry is discussed in [Cad88]. Several methods that do not require eigenvalue
decomposition are discussed in [Rei87, Di85, Xu92, Wei93a, Fuc94, Che94, Yan94a, Sou95].

The method proposed in [Rei87] is applicable for a linear array of L elements. It forms
a K × K correlation matrix from one snapshot with K > M, and is based on the QR
orthonormal decomposition [Gol83] of this correlation matrix, with Q being a K × K
unitary matrix and R being upper triangular. The last K - M columns of Q define an
orthonormal basis for the noise space. Denoting these columns by $U_N$, the source directions
are obtained from power spectrum peaks:

$$P(\theta) = \frac{1}{\left| S_\theta^H U_N \right|^2} \tag{6.12.1}$$

The method is computationally efficient and the performance is comparable to MUSIC
[Rei87]. A multiple-source location method based on the matrix decomposition approach
is presented in [Di85]. The method requires knowledge of the noise power estimate,
and is applicable for coherent as well as noncoherent arrivals. It does not require knowledge
of the number of sources.

The method discussed in [Xu92] exploits the cyclostationarity [Gar91] of data that may
exist in certain situations. The method has significant implementation advantages and its
performance is comparable with the other methods. A method based on polynomial rooting
is discussed in [Wei93a]; it estimates DOA with high resolution, has low
computation requirements, and exploits the polarization diversity of an array. Such arrays
have the capability of separating signals based on polarization characteristics, and thus
have an advantage over uniformly polarized arrays [Fer83, Zis90].

An adaptive scheme based on Kalman filtering to estimate noise subspace is presented 
in [Che94], which is then combined with root-MUSIC to estimate DOA. The method has 
good convergence characteristics. The method presented in [Fuc94] uses a deconvolution 
approach to the output of a conventional processor to localize sources, whereas those 
discussed in [Yan94a, Sou95] use a neural network approach to direction finding. 

The discussion on DOA estimation so far has concentrated on estimating the
directions of stationary narrowband sources. Although extension of a narrowband
direction-finding scheme to the broadband case is not trivial, some of the methods discussed
here have been extended to estimate broadband source directions. For discussion of these
and other schemes, see [Wax84, Su83, Wan85, Swi89, Kro89, Ott90, Cad90, Dor93, Gre94,
Swi94, Buc88, Hun90]. The methods described in [Su83, Wan85, Swi89, Cad90, Buc88] are
based on a signal subspace approach, whereas those discussed in [Ott90, Hun90] and
[Dor93, Sch93] are related to the ESPRIT method and the ML method, respectively. Applications
of high-resolution direction-finding methods to estimate the directions of moving
sources and to track these sources are described in [Rao94, Yan94, Liu94, Eri94, Sas91].
The problem of estimating the mean DOA of spatially distributed sources such as those
in base mobile communication systems has been examined in [Men96, Tru96].


6.13 Preprocessing Techniques 

Several techniques are used to process data before using direction-finding methods for
DOA estimation, particularly in situations where directional sources are correlated or
coherent. Correlation of directional sources may exist due to multipath propagation, and
tends to reduce the rank of the array correlation matrix, as discussed in Chapter 5. The
correlation matrix may be tested for source coherency by applying the rank profile test
described in [Sha87]. Most preprocessing techniques either try to remedy this rank deficiency
in the correlation matrix or modify it to be useful for the DOA estimation methods.
In this section, some of these techniques are reviewed.

One scheme referred to as the spatial smoothing method has been widely studied in
the literature [Sha85, Wil88, Yeh89, Pil89, Lee90a, Mog91, Du91, Mog92, Yan92, Rao93,
Lio89, Wei93, Eva81], and is applicable for a linear array. Details on spatial smoothing for
beamforming are provided in Chapter 5. In its basic form, it decorrelates the correlated
arrivals by subdividing the array into a number of smaller overlapping subarrays and then
averaging the array correlation matrices obtained from each subarray. The number of subarrays
obtained from an array depends on the number of elements used in each subarray.
For example, using K elements in each subarray, L - (K - 1) subarrays can be formed from
an array of L elements by forming the first subarray using elements 1 to K, the second
subarray using elements 2 to K + 1, and so on. The number and size of subarrays are
determined from the number of directional sources under consideration. For M sources,
a subarray size of M + 1 and a subarray number greater than or equal to M are necessary
[Sha85].
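
The basic forward scheme can be sketched as follows: the full correlation matrix is split into overlapping K × K diagonal blocks, one per subarray, and the blocks are averaged. The sketch assumes a half-wavelength ULA, two fully coherent arrivals, and a noise-free correlation matrix so that the rank restoration is easy to see; the function name and scenario are hypothetical.

```python
import numpy as np

def steering_vector(theta, num_elements, spacing=0.5):
    """ULA steering vector; theta in radians from the array axis, spacing in wavelengths."""
    n = np.arange(num_elements)
    return np.exp(1j * 2 * np.pi * spacing * n * np.cos(theta))

def forward_spatial_smoothing(R, subarray_size):
    """Average the correlation matrices of the L - (K - 1) overlapping subarrays."""
    L = R.shape[0]
    K = subarray_size
    num_subarrays = L - (K - 1)
    blocks = [R[m:m + K, m:m + K] for m in range(num_subarrays)]
    return sum(blocks) / num_subarrays, num_subarrays

L, K = 8, 5
a1 = steering_vector(np.deg2rad(60.0), L)
a2 = steering_vector(np.deg2rad(100.0), L)
coherent_sum = a1 + a2                              # second arrival is a copy of the first
R = np.outer(coherent_sum, coherent_sum.conj())     # rank-one signal correlation matrix

R_smoothed, num_subarrays = forward_spatial_smoothing(R, K)
rank_before = np.linalg.matrix_rank(R, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_smoothed, tol=1e-8)
print(num_subarrays, rank_before, rank_after)
```

The price of the restored rank is a smaller effective aperture, since the smoothed matrix has dimension K rather than L.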

Thus, to estimate the directions of M sources, array size L = 2M is required, which could
be reduced to 3M/2 by using the forward-backward spatial smoothing method [Wil88,
Pil89]. This process uses the average of the correlation matrix obtained from the forward
subarray scheme and the correlation matrix obtained from the backward subarray scheme.

The forward subarray scheme subdivides the array starting from one side of the array 
as discussed above, whereas the backward subarray scheme subdivides the array starting 
from the other side of the array. Thus, in the forward subarray scheme, the first subarray 
is formed using elements 1 to K, whereas in the backward subarray scheme the first 
subarray is formed using elements L to L - (K - 1) and so on. The mth subarray matrix
$\bar{R}_m$ of the backward method is related to the forward-method matrix $R_m$ by

$$\bar{R}_m = J_0 R_m^{*} J_0 \qquad (6.13.1)$$

where $J_0$ is a reflection matrix with all its elements along the secondary diagonal being
equal to unity and zero elsewhere, that is,

$$J_0 = \begin{bmatrix} 0 & \cdots & 0 & 1 \\ 0 & \cdots & 1 & 0 \\ \vdots & & & \vdots \\ 1 & 0 & \cdots & 0 \end{bmatrix} \qquad (6.13.2)$$


The process is similar to that used by forward-backward prediction for bearing estimation 
[Lee90a]. 
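The forward and forward-backward averaging described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the procedure of any cited reference: the function name `spatial_smoothing`, the 0-based element indexing, the helper `steering`, and the half-wavelength element spacing in the demonstration are all choices made here for illustration.

```python
import numpy as np

def spatial_smoothing(R, K, forward_backward=True):
    """Average the K x K subarray correlation matrices of an L x L matrix R."""
    L = R.shape[0]
    n_sub = L - K + 1                       # L - (K - 1) overlapping subarrays
    # Forward scheme: subarray m uses elements m .. m + K - 1 (0-based).
    R_f = sum(R[m:m + K, m:m + K] for m in range(n_sub)) / n_sub
    if not forward_backward:
        return R_f
    J0 = np.fliplr(np.eye(K))               # reflection matrix
    # Averaging the backward matrices J0 R_m^* J0 over all m gives J0 R_f^* J0.
    R_b = J0 @ R_f.conj() @ J0
    return 0.5 * (R_f + R_b)

def steering(L, theta_deg, d=0.5):
    """Steering vector of a linear equispaced array, spacing d in wavelengths."""
    k = np.arange(L)
    return np.exp(2j * np.pi * d * k * np.sin(np.deg2rad(theta_deg)))

# Assumed scenario: two coherent, noise-free plane waves on a 6-element array.
L, K = 6, 4
a = steering(L, 10.0) + 0.8 * steering(L, 40.0)    # coherent sum of two arrivals
R = np.outer(a, a.conj())                          # rank-deficient correlation matrix
R_s = spatial_smoothing(R, K)
rank_before = np.linalg.matrix_rank(R, tol=1e-8)
rank_after = np.linalg.matrix_rank(R_s, tol=1e-8)
```

In this noise-free setup, the unsmoothed matrix has rank one, while the smoothed 4 x 4 matrix regains rank two, which is what subspace methods such as MUSIC need to separate the two arrivals.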

An improved spatial smoothing method [Du91] uses correlation between all array elements,
rather than correlation between subarray elements as is done in the forward-backward
spatial smoothing method. It estimates a cross-correlation matrix $R_{mj}$ from subarrays m
and j, that is,

$$R_{mj} = E\left[ x_m(t)\, x_j^{H}(t) \right] \qquad (6.13.3)$$

with $x_m(t)$ and $x_j(t)$, respectively, denoting the array signal vector from the mth and jth
subarrays.

The forward subarray matrix $R_m$ is then obtained using

$$R_m = \frac{1}{L_0} \sum_{j=1}^{L_0} R_{mj} R_{jm} \qquad (6.13.4)$$

and $\bar{R}_m$ is obtained by substituting $R_m$ from (6.13.4) in (6.13.1). $L_0$ in (6.13.4) denotes the
number of subarrays used.
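As a rough sketch of (6.13.3) and (6.13.4), the cross-correlation blocks can be read directly from the full array correlation matrix, since the mth subarray signal vector is a contiguous slice of the array signal vector. The function names below and the 0-based subarray index are assumptions made for illustration; [Du91] should be consulted for the complete method.

```python
import numpy as np

def improved_forward_matrix(R, K, m):
    """R_m = (1/L0) * sum_j R_mj R_jm, with R_mj taken as a K x K block of R."""
    L = R.shape[0]
    L0 = L - K + 1                          # number of subarrays

    def cross(a, b):
        # Cross-correlation of subarrays a and b: E[x_a x_b^H] is a block of R.
        return R[a:a + K, b:b + K]

    return sum(cross(m, j) @ cross(j, m) for j in range(L0)) / L0

def backward_from_forward(R_m):
    """Backward matrix from the forward one, following (6.13.1)."""
    K = R_m.shape[0]
    J0 = np.fliplr(np.eye(K))
    return J0 @ R_m.conj() @ J0
```

Because the block for subarrays j and m is the conjugate transpose of the block for m and j when R is Hermitian, each term of the sum is positive semidefinite, so the resulting matrix is Hermitian as well.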

A method described in [Mog91, Mog92] removes the effects of sensor noise to make 
spatial smoothing more effective in low SNR situations. This spatial filtering method is 
further refined in [Del96] to offer DOA estimates of coherent sources with reduced RMS 
errors. 

A decorrelation analysis of spatial smoothing [Yan92] shows that there exists an upper 
bound on the number of subarrays and the maximum distance between the subarrays 
depends on the fractional bandwidth of the signals. A comprehensive analysis of the use 
of spatial smoothing as a preprocessing technique to weighted ESPRIT and MUSIC meth¬ 
ods of DOA estimation presented in [Rao93] shows how their performance could be 
improved by proper choice of the number of subarrays and weighting matrices. An ESPRIT 
application to estimate the source directions and polarization shows improvement in its 
performance in the presence of coherent arrivals when it is combined with the spatial 
smoothing method [Li93]. 

Spatial smoothing methods using subarray arrangements reduce the effective aperture 
of the array as well as degrees of freedom, and thus more elements are needed to process 
correlated arrivals than would otherwise be required. The schemes that do not reduce 
effective array size include those that restore the structure of the array correlation matrix 
for the linear array to an uncorrelated one. These are referred to as structured methods 
[God90, Tak87].

Structured methods rely on the fact that for a linear equispaced array, the correlation 
matrix in the absence of correlated arrivals has a Toeplitz structure, that is, the elements 
of the matrix along its diagonals are equal. Correlation between sources destroys this 
structure. In [God90], the structure is restored by averaging the matrix obtained in the 
presence of correlated arrivals by simple averaging along the diagonals as detailed in 
Chapter 5, while in [Tak87] a weighted average is used. A DOA estimation method using 
the array correlation matrix structured by averaging along its diagonals discussed in
[Fuc96] appears to offer computational advantages over similar methods. 
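The simple averaging along the diagonals can be illustrated as follows. This is a minimal sketch of the idea only, assuming a Hermitian input matrix; the function name `toeplitz_average` is hypothetical, and the weighted variant of [Tak87] is not shown.

```python
import numpy as np

def toeplitz_average(R):
    """Replace each diagonal of a Hermitian matrix R by its average value."""
    L = R.shape[0]
    T = np.zeros_like(R, dtype=complex)
    for k in range(L):
        r_k = np.mean(np.diagonal(R, offset=k))    # mean of the kth superdiagonal
        idx = np.arange(L - k)
        T[idx, idx + k] = r_k                      # fill the kth superdiagonal
        T[idx + k, idx] = np.conj(r_k)             # Hermitian symmetry below it
    return T

# Small example: the first superdiagonal holds 2 and 4, so its average is 3.
R_example = np.array([[1.0, 2.0, 3.0],
                      [2.0, 1.0, 4.0],
                      [3.0, 4.0, 1.0]])
T_example = toeplitz_average(R_example)
```

The output has equal elements along every diagonal, which is the structure the correlation matrix of a linear equispaced array would have in the absence of correlated arrivals.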

Other preprocessing schemes to decorrelate sources include random permutation 
[Lio89], mechanical movement using a circular disk [Lim90], construction of a preprocess¬ 
ing matrix using approximate knowledge of DOA estimate [Wei94a], signal subspace 
transformation in the spatial domain [Par93], unitary transformation method [Hua91], 
and methods based on aperture interpolations [Wei93, Swi89a, Wei95]. 


6.14 Estimating Source Number 

Many high-resolution direction-finding methods require knowledge of the number of directional
sources, and their performance depends on perfect knowledge of this number.
Selected methods for estimating the number of these sources are discussed in this section. 

The most commonly cited method for detecting the number of sources was first
introduced in [Wax85] based on Akaike's information criterion (AIC) [Aka74] and Rissanen's
minimum description length (MDL) [Ris78] principle. The method was further
analyzed in [Zha89, Wan86] and modified in [Yin87, Won90]. A variation of the method 
that is applicable to coherent sources is discussed in [Wax92, Wax89a, Wax91]. Briefly, the 
method works as follows [Wax85, Wan86]: 

1. Estimate the array correlation matrix from N independent and identically distributed
samples.

2. Find the L eigenvalues $\lambda_i$, i = 1, 2, ..., L of the correlation matrix such that
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_L$.

3. Estimate the number of sources M by solving

$$\underset{M}{\text{minimize}} \quad N(L - M) \log\left\{ \frac{f_1(M)}{f_2(M)} \right\} + f_3(M, N) \qquad (6.14.1)$$

where

$$f_1(M) = \frac{1}{L - M} \sum_{i=M+1}^{L} \lambda_i \qquad (6.14.2)$$

$$f_2(M) = \left( \prod_{i=M+1}^{L} \lambda_i \right)^{\frac{1}{L - M}} \qquad (6.14.3)$$

and the penalty function

$$f_3(M, N) = \begin{cases} M(2L - M) & \text{for AIC} \\ \frac{1}{2} M(2L - M) \log N & \text{for MDL} \end{cases} \qquad (6.14.4)$$

with L denoting the number of elements in the array.
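The three steps above can be sketched as a direct search over candidate values of M. The code below is a minimal illustration, assuming the cost is the log ratio of the arithmetic mean to the geometric mean of the L - M smallest eigenvalues plus the AIC or MDL penalty; the function name `estimate_num_sources` and the test scenario (an exact rather than estimated correlation matrix, half-wavelength spacing) are assumptions made here.

```python
import numpy as np

def estimate_num_sources(R, N, criterion="MDL"):
    """Return the M in 0..L-1 that minimizes the AIC or MDL cost."""
    lam = np.sort(np.linalg.eigvalsh(R))[::-1]     # lambda_1 >= ... >= lambda_L
    L = lam.size
    costs = []
    for M in range(L):
        noise = lam[M:]                            # the L - M smallest eigenvalues
        f1 = noise.mean()                          # arithmetic mean
        f2 = np.exp(np.mean(np.log(noise)))        # geometric mean
        if criterion == "AIC":
            f3 = M * (2 * L - M)
        elif criterion == "MDL":
            f3 = 0.5 * M * (2 * L - M) * np.log(N)
        else:
            raise ValueError("criterion must be 'AIC' or 'MDL'")
        costs.append(N * (L - M) * np.log(f1 / f2) + f3)
    return int(np.argmin(costs))

# Assumed scenario: two uncorrelated unit-power sources in unit-variance white
# noise, using the exact correlation matrix of a 6-element linear array.
L = 6
k = np.arange(L)
A = np.stack([np.exp(1j * np.pi * k * np.sin(np.deg2rad(t))) for t in (0.0, 30.0)],
             axis=1)
R = A @ A.conj().T + np.eye(L)
M_mdl = estimate_num_sources(R, N=100, criterion="MDL")
M_aic = estimate_num_sources(R, N=100, criterion="AIC")
```

With an exact correlation matrix the noise eigenvalues are equal, so the log term vanishes at the true M and the penalty decides among larger candidates. With a matrix estimated from few samples, the two criteria may disagree.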

A modification of the method based on the MDL principle applicable to coherent sources 
is discussed in [Wax89a], which is further refined in [Wax92, Wax91] to improve performance.
A parametric method that does not require knowledge of eigenvalues of the array
correlation matrix is discussed in [Wu91]. It has better performance than some other
methods discussed but is computationally more complex.

All methods that partition the eigenvalues of the array correlation matrix rely on the 
fact that the M eigenvalues corresponding to M directional sources are larger than the rest 
of the L - M eigenvalues corresponding to the background noise; they differ in how they select
the threshold. One of the earliest methods uses a hypothesis-testing procedure
based on the confidence interval of noise eigenvalues [And63]. In that method, threshold
assignment was subjective.

The eigenthreshold method uses a one-step prediction of the threshold for differentiating 
the smallest eigenvalues from the others. This method performs better than AIC and MDL. 
It has a threshold at a lower SNR value than MDL and a lower error rate than AIC at high 
SNRs [Che91]. 

An alternate scheme for estimating the number of sources discussed in [Lee94] uses the 
eigenvectors of the array correlation matrix; in contrast, other methods use the eigenvalues 
of the array correlation matrix. This method, referred to as the eigenvector detection 
technique, is applicable to a cluster of sources whose approximate directions are known, 
and is able to estimate the number of sources at a lower SNR than AIC and MDL.

In practice, the number of sources an array may be able to resolve not only depends on 
the number of elements in the array but also on array geometry, available number of 
snapshots, and spatial distribution of sources. For discussion of these and other issues 
related to array capabilities to uniquely resolve the number of sources, see [Fri91, Wax89, 
Bre86] and references therein. 


6.15 Performance Comparison 

Performance analysis of various direction-finding schemes has been carried out by many
researchers [Joh82, Rao89, Lee90, Xu93, Pil89a, Sto89, Sto90a, Xu92a, Xu94a, Lee91, Zho91, 
Zho93, Zha95a, Kau95, Kav86, Sto90, Sto91, Ott91, Mat94, Ott92, Vib95, Wei93, Cap70]. 
The performance measures considered for analysis include bias, variance, resolution, 
CRLB, and probability of resolution. In this section, the performance of selected DOA 
estimation schemes is discussed. 

The MUSIC algorithm has been studied in [Lee90, Xu93, Pil89a, Sto89, Sto90a, Xu92a, 
Xu94a, Lee91, Zho91, Zho93, Zha95a, Kau95, Kav86]. Most of these studies concentrate 
on its performance and performance comparisons with other methods when a finite 
number of samples is used for direction finding rather than their ensemble average. 

A rigorous bias analysis of MUSIC shows [Xu92a] that the MUSIC estimates are biased. 
For a linear array in the presence of a single source, the bias increases as the source moves 
away from broadside. Interestingly, the bias also increases as the number of elements 
increases without changing the aperture. An asymptotic analysis of MUSIC with forward-backward
spatial smoothing in the presence of correlated arrivals shows that to
estimate two angles of arrival of equal power under identical conditions, more snapshots 
are required for correlated sources than for uncorrelated sources [Pil89a, Kav86]. 


© 2004 by CRC Press LLC 



Bias and the standard deviation (STD) are complicated functions of the array geometry,
SNR, and number and directions of sources, and vary in inverse proportion to the number
of snapshots. A poorer estimate generally results using a smaller number of snapshots
and sources with lower SNR. As shown in [Xu93, Xu92a], the performance of conventional 
MUSIC is poor in the presence of correlated arrivals, and it fails to resolve coherent sources. 

Although the bias and STD both play important roles in direction estimation, the effect 
of bias near the threshold region is critical. A comparison of MUSIC performance with 
those of the minimum-norm and FINE for finite-sample cases shows [Xu94a] that in the 
low SNR range, the minimum-norm estimates have the largest STD and MUSIC estimates 
have the largest bias. These results are dependent on source SNR, and the performance 
of all three schemes approaches the same limit as the SNR is increased. The overall
performance of FINE is better than the other two in the absence of correlated arrivals. 

The estimates obtained by MUSIC and ML methods are compared with the CRLB in 
[Sto89, Sto90a] for large-sample cases. The CRLB gives the theoretically lowest value of 
the covariance of an unbiased estimator; it decreases with the number of samples, number 
of sensors in the array, and source SNR [Sto89]. The study [Sto89] concluded that the
MUSIC estimates are the large-sample realization of ML estimates in the presence of uncorrelated
arrivals. Furthermore, it shows that the variance of the MUSIC estimate is greater
than that of the ML estimate, and that the variances of the two methods approach each other as the
number of elements and snapshots increases. Thus, using an array with a large number of
elements and samples, excellent estimates of the directions of uncorrelated sources
with large SNRs are possible using the MUSIC method [Sto89]. It should be noted that MLM estimates
are unbiased [Vib95]. An unbiased estimate is also referred to as a consistent estimate. 

An improvement in MUSIC DOA estimation is possible by beamspace MUSIC [Lee90, 
Xu93]. By properly selecting a beamforming matrix and then using the MUSIC scheme to 
estimate DOA, one is able to reduce the threshold level of the required SNR to resolve 
the closely spaced sources [Lee90]. Although the variance of this estimate is not much 
different from the element space case, it has less bias [Xu93]. The resolution threshold of
beamspace MUSIC is lower than that of the conventional minimum-norm method. However, for
two closely spaced sources, the beamspace MUSIC and beamspace minimum-norm methods provide
identical performance when suitable beamforming matrices are selected [Lee90].

As shown in [Kau95], when beamforming weights have conjugate symmetry (useful
only for arrays with particular symmetry), the beamspace MUSIC has decorrelation properties
similar to backward-forward smoothing and thus is useful for estimation of correlated
arrival source direction and offers performance advantages in terms of lower
variance for the estimated angle.

The resolution property of MUSIC analyzed in [Lee91, Zho91, Zho93, Zha95a, Kav86] 
shows how it depends on SNR, number of snapshots, array geometry, and separation 
angle of the two sources. The two closely spaced sources are said to be resolved when 
two peaks in the spectrum appear in the vicinity of the two sources' directions. Analytical 
expressions of resolution probability and its variation as a function of various parameters 
are presented in [Zha95a], and could be used to predict the behavior of a MUSIC estimate 
for a given scenario. 

A performance comparison of MUSIC and another eigenvector method, which uses 
noise eigenvectors divided by corresponding eigenvalues for DOA estimation, indicates 
[Joh82] that the former is more sensitive to the choice of assumed number of sources 
compared to actual number of sources. 

Performance analysis of many versions of ESPRIT is considered in [Rao89, Sto91a,
Ott91, Mat94a] and compared with other methods. Estimates obtained by subspace rotation
methods that include the Toeplitz approximation method (TAM) and ESPRIT have
greater variance than those obtained by MUSIC using large numbers of samples [Sto91a];
estimates by ESPRIT using a uniform circular array are asymptotically unbiased [Mat94a];
LS-ESPRIT and TAM estimates are statistically equivalent; LS-ESPRIT and TLS-ESPRIT
have the same MSE [Rao89] and their performance depends on how subarrays are selected
[Ott91]; the minimum-norm method is equivalent to TLS-ESPRIT [Dow91]; and root-MUSIC
outperforms ESPRIT [Rao89a]. TAM is based on the state space model, and
finds DOA estimates from the signal subspace. In spirit, its approach is similar to ESPRIT
[Rao89]. The WSF and ML methods are efficient for Gaussian signals, as both attain the CRLB
asymptotically [Sto90b, Ott92]. A method is said to be efficient when it achieves the CRLB.

A correlation between sources affects the capabilities of various DOA estimation algorithms
differently [DeG85]. A study of the effect of the correlation between two sources
on the accuracy of DOA-finding schemes presented in [Wei93b] shows that the correlation
phase is more significant than correlation magnitude. Most performance analyses discussed
assume that the background noise is white. When this is not the case, the DOA
schemes perform differently. In the presence of colored background noise, MUSIC performance
is better than that of ESPRIT and the minimum-norm method over a wide range
of SNRs. The performance of the minimum-norm method is worse than that of MUSIC and
ESPRIT [Li92].


6.16 Sensitivity Analysis 

Sensitivity analysis of MUSIC to various perturbations is presented in [Swi92, Rad94, 
Fri94, Wei94, Ng95, Soo92]. A compact expression for the error covariance of the MUSIC 
estimates given in [Swi92] may be used to evaluate the effect of various perturbation 
parameters including gain and phase errors, effect of mutual coupling, channel errors, 
and random perturbations in sensor locations. It should be noted that MUSIC estimates
of DOA require knowledge of the number of sources, similar to certain other methods, and
underestimation of the source number may lead to inaccurate estimates of DOAs [Rad94].
A variance expression for the DOA estimate for this case has been provided in [Rad94]. 

Analysis of the effect of model errors on the MUSIC resolution threshold [Fri90, Wei94]
and on the waveforms estimated using MUSIC [Fri94] indicates that the probability of
resolution decreases [Wei94] with the error variance, and that the sensitivity to phase 
errors depends more on array aperture than the number of elements [Fri94] in a linear 
array. The effect of gain and phase error on the mean square error (MSE) of the MUSIC 
estimate of a general array is analyzed in [Sri91]. The problem of estimating gain and 
phase errors of sensors with known locations is considered in [Ng95]. 

An analysis [Soo92] of ESPRIT under random sensor uncertainties suggests that the 
MUSIC estimates generally give lower MSEs than ESPRIT estimates. The former is more 
sensitive to both sensor gain and phase errors, whereas the latter depends only on phase 
errors. The study further suggests that for a linear array with a large number of elements, 
the MSE of the ESPRIT estimate with maximum overlapping subarrays is lower than 
nonoverlapping subarrays. 

The effect of gain and phase errors on weighted eigenspace methods including MUSIC, 
minimum-norm, FINE, and CLOSEST is studied in [Ham95] by deriving bias and variance 
expressions. This study indicates that the effect is gradual up to a point, beyond which a further
increase in error magnitude causes an abrupt deterioration in bias and variance. The
weighted eigenspace methods differ from the standard ones such that a weighting matrix 
is used in the estimate, and that matrix could be optimized to improve the quality of the 
estimate under particular perturbation conditions. 
The effect of nonlinearity in the system on spectral estimation methods, including hard
clipping common in digital beamformers, has been analyzed in [Tut81]. It shows that by 
using additional preprocessing such distortions could be eliminated. 

Effects of various perturbations on DOA estimation methods emphasize the importance 
of precise knowledge of various array parameters. Selected techniques to calibrate arrays 
are discussed in [Wei89, Wyl94]. Schemes to estimate the steering vector, and in turn DOA 
from uncalibrated arrays, are discussed in [Tse95]. [Che91a] focuses on a scheme to estimate
DOA. Discussions on robustness issues of direction-finding algorithms are found in [Fli94,
Wei90]. A summary of performance and sensitivity comparisons of various DOA estimation
schemes is provided in Tables 6.1 to 6.12 [God97].


TABLE 6.1
Performance Summary of Bartlett Method

Property        Comments and Comparison
Bias            Biased
                Bartlett > LP > MLM
Resolution      Depends on array aperture
Sensitivity     Robust to element position errors
Array           General array


TABLE 6.2
Performance Summary of MVDR Method

Property        Comments and Comparison
Bias            Unbiased
Variance        Minimum
Resolution      MVDR > Bartlett
                Does not have best resolution of any method
Array           General array

TABLE 6.3
Performance Summary of Maximum Entropy Method

Property        Comments and Comparison
Bias            Biased
Resolution      ME > MVDR > Bartlett
                Can resolve at lower SNR than Bartlett

TABLE 6.4
Performance Summary of Linear Prediction Method

Property        Comments and Comparison
Bias            Biased
Resolution      LP > MVDR
                > Bartlett
                > ME
Performance     Good in low SNR conditions
                Applicable for correlated arrivals

TABLE 6.5
Performance Summary of ML Method

Property        Comments and Comparison
Bias            Unbiased
                Less than LP, Bartlett, MUSIC
Variance        Less than MUSIC for small samples
                Asymptotically efficient for random signals
                Not efficient for finite samples
                Less efficient for deterministic signals than random signals
                Asymptotically efficient for deterministic signals using very large array
Computation     Intensive with large samples
Performance     Same for deterministic and random signals for large arrays
                Applicable for correlated arrivals
                Works with one sample


TABLE 6.6
Performance Summary of Element Space MUSIC Method

Property        Comments and Comparison
Bias            Biased
Variance        Less than ESPRIT and TAM for large samples, minimum norm
                Close to MLM, CLOSEST, FINE
                Variance of weighted MUSIC is more than unweighted MUSIC
                Asymptotically efficient for large array
Resolution      Limited by bias
Array           Applicable for general array
                Increasing aperture makes it robust
Performance     Fails to resolve correlated sources
Computation     Intensive
Sensitivity     Array calibration is critical, sensitivity to phase error depends more on array
                aperture than number of elements, preprocessing can improve resolution
                Correct estimate of source number is important
                MSE depends on both gain and phase errors and is lower than for ESPRIT
                Increase in gain and phase errors beyond certain value causes an abrupt
                deterioration in bias and variance


TABLE 6.7
Performance Summary of Beam Space MUSIC Method

Property        Comments and Comparison
Bias            Less than element space MUSIC
Variance        Larger than element space MUSIC
RMS Error       Less than ESPRIT, minimum norm
Resolution      Similar to beamspace minimum norm, CLOSEST
                Better than element space MUSIC, element space minimum norm
                Threshold SNR decreases as the separation between the sources increases
Computation     Less than element space MUSIC
Sensitivity     More robust than element space MUSIC

TABLE 6.8
Performance Summary of Root-MUSIC Method

Property        Comments and Comparison
Variance        Less than root minimum norm, ESPRIT
Resolution      Beamspace root-MUSIC has better probability of resolution than beamspace MUSIC
RMS error       Less than LS ESPRIT
Array           Equispaced linear array
Performance     Better than spectral MUSIC
                Similar to TLS ESPRIT at SNR lower than MUSIC threshold
                Beamspace root-MUSIC is similar to element space root-MUSIC


TABLE 6.9
Performance Summary of Minimum Norm Method

Property        Comments and Comparison
Bias            Less than MUSIC
Resolution      Better than CLOSEST, element space MUSIC
Method          Equivalent to TLS


TABLE 6.10
Performance Summary of CLOSEST Method

Property        Comments and Comparison
Variance        Similar to element space MUSIC
Resolution      Similar to beamspace MUSIC
                Better than minimum norm
Performance     Good in clustered situation
Sensitivity     An increase in sensor gain and phase errors beyond certain
                value causes an abrupt deterioration in bias and variance


TABLE 6.11
Performance Summary of ESPRIT Method

Property        Comments and Comparison
Bias            TLS ESPRIT unbiased
                LS ESPRIT biased
RMS Error       Less than minimum norm
                TLS similar to LS
Variance        Less than MUSIC for large samples and difference increases with number of
                elements in array
Computation     Less than MUSIC
                Beam space ESPRIT needs less computation than beamspace root-MUSIC and ES ESPRIT
Method          LS ESPRIT is similar to TAM
Array           Needs doublets, no calibration needed
Performance     Optimum-weighted ESPRIT is better than uniform-weighted ESPRIT
                TLS ESPRIT is better than LS ESPRIT
Sensitivity     More robust than MUSIC and cannot handle correlated sources
                MSE robust for sensor gain errors
                MSE is lowest for maximum overlapping subarrays under sensor perturbation

TABLE 6.12
Performance Summary of FINE Method

Property        Comments and Comparison
Bias            Less than MUSIC
Resolution      Better than MUSIC and minimum norm
Variance        Less than minimum norm
Performance     Good at low SNR

Notation and Abbreviations


AIC Akaike's information criterion 

CANAL concurrent nulling and location 

CRLB Cramer-Rao lower bound 

DOA direction of arrival 

ESPRIT estimation of signal parameters via rotational invariance technique 

FINE first principal vector 

LMS least mean square 

LP linear prediction 

LS least square 

MAP maximum a posteriori 

MDL minimum description length 

ME maximum entropy 

ML maximum likelihood 

MLM maximum likelihood method 

MSE mean square error 

MVDR minimum variance distortionless response 

MUSIC multiple signal classification 

SNR signal-to-noise ratio 

STD standard deviation 

TAM Toeplitz approximation method 

TLS total least square 

ULA uniformly spaced linear array 

WSF weighted subspace fitting 

A                      L x M matrix with columns being steering vectors
d                      interelement spacing of linear equispaced array
E[ ]                   expectation operator
$e_1$                  vector of all zeros except first element, which is equal to unity
$f_N$                  Nyquist frequency
H(s)                   entropy function
$J_0$                  reflection matrix with all elements along secondary diagonal being equal to
                       unity and zero elsewhere
K                      number of elements in subarray
L                      number of elements in array
$L_0$                  number of subarrays
M                      number of directional sources
N                      number of samples
$P_B(\theta)$          power estimated by Bartlett method as function of $\theta$
$P_{LP}(\theta)$       power estimated by linear prediction method as function of $\theta$
$P_{ME}(\theta)$       power estimated by maximum entropy method as function of $\theta$
$P_{MN}(\theta)$       power estimated by minimum norm method as function of $\theta$
$P_{MU}(\theta)$       power estimated by MUSIC method as function of $\theta$
$P_{MV}(\theta)$       power estimated by MVDR method as function of $\theta$
R                      array correlation matrix
$R_m$                  mth subarray matrix of forward method
$\bar{R}_m$            mth subarray matrix of backward method
$R_{mj}$               cross correlation matrix of mth and jth subarrays
$r_{ij}$               correlation between the ith and the jth elements
$S_\theta$             steering vector associated with the direction $\theta$
S(f)                   power spectral density of signal s(t)
s(t)                   vector of M source signals induced on reference element
T                      transformation matrix
$U_N$                  matrix with its L - M columns being the eigenvectors corresponding to the
                       L - M smallest eigenvalues of R
$U_S$                  matrix with M columns being eigenvectors corresponding to M largest
                       eigenvalues
$u_1$                  column vector of all zeros except one element that is equal to unity
w                      array weight vector
$\hat{w}$              optimized array weights
$\Phi$                 diagonal matrix defined by (6.10.4)
$\Delta_0$             magnitude of displacement vector
$\Lambda$              diagonal matrix defined by (6.10.11)
$\theta$               direction of source
$\tau_{ij}(\theta)$    differential delay between elements i and j due to source in direction $\theta$
$\Psi$                 transformation matrix

Single-Antenna System in Fading Channels

7.3.4 Error Rate Performance
Notation and Abbreviations 
References 

In previous chapters, it is assumed that the directional signals arrive from point sources as 
plane wave fronts. In mobile communication channels, the received signal is a combination 
of many components arriving from various directions as a result of multipath propagation. 
Depending on terrain conditions and local buildings and structures, the power of the 
received signal fluctuates randomly as a function of distance. Fluctuations on the order of 
20 dB are common within the distance of one wavelength. This phenomenon is called fading. 

In this chapter, a brief review of fading channels is presented with a view to introducing
notation and to develop mathematical equations to be used for analyzing the behavior of 
communication systems. The chapter also contains analyses of a single antenna system 
under various fading conditions. The methodology presented in this chapter would be 
helpful in analyzing the performance of various diversity-combining schemes discussed 
in Chapter 8, and results would serve as a reference for comparison. 

A detailed treatment of fading channels is presented in [Skl02]. For an introduction to 
mobile communications, see [God02]. For details on digital communications and the 
required probability theory, see [Pro95]. 


7.1 Fading Channels 

Let a transmitted signal s(t) be expressed in complex notation as

$$s(t) = \mathrm{Re}\left[ g(t)\, e^{j 2\pi f_c t} \right] \qquad (7.1.1)$$









where Re[.] denotes the real part of a complex quantity, $f_c$ is the carrier frequency, and g(t)
is the complex envelope of s(t) that can be expressed in the magnitude and phase form as

$$g(t) = |g(t)|\, e^{j\phi(t)} \qquad (7.1.2)$$

where |g(t)| is the magnitude and $\phi(t)$ is the phase of the complex baseband waveform.

For frequency- and phase-modulated signals, |g(t)| is constant. Without any loss of
generality, it is assumed in the present discussion that it is equal to unity.

In the mobile communications environment, the transmitted signal undergoes fading. 
There are two kinds of fading, namely large-scale and small-scale. 

Large-scale fading, also known as shadowing, is caused by hills and large buildings. It 
determines the local mean signal power at distance R from the transmitter. Let S denote
this power. It is a random quantity and the random variable S has a log-normal distribution.

Let $S_d$ denote the mean signal power in decibels. Thus, $S_d$ and S are related by

$$S_d = 10 \log S \qquad (7.1.3)$$

where log denotes $\log_{10}(\cdot)$. The random variable (RV) $S_d$ has a normal distribution.

Small-scale fading, on the other hand, is a local phenomenon caused by multipath 
propagation. It causes a rapid fluctuation of the signal around the slowly varying local
mean. 

Let $x_0(t)$ denote the signal component induced on an antenna. It is given by

$$x_0(t) = \mathrm{Re}\left[ \tilde{x}_0(t) \right] \qquad (7.1.4)$$

where $\tilde{x}_0(t)$ is the signal component in the complex form. Following (7.1.1) and (7.1.2), $\tilde{x}_0(t)$
can be expressed as

$$\tilde{x}_0(t) = r(t)\, e^{j\theta(t)}\, e^{j 2\pi f_c t}\, e^{j\phi(t)} \qquad (7.1.5)$$

In the above equation, the complex random quantity $r(t) e^{j\theta(t)}$ accounts for channel fading
with r(t) denoting the signal amplitude and $\theta(t)$ representing the random phase process
uniformly distributed in $[0, 2\pi)$.

It is convenient to think of r(t) as a product of two variables, that is,

$$r(t) = m(t)\, r_0(t) \qquad (7.1.6)$$

where m(t) is a slowly varying quantity and denotes the local mean value of the signal.
It accounts for large-scale fading and the effect of shadowing, and determines the local
mean power S given by

$$S = m^2(t) \qquad (7.1.7)$$

The complex quantity $r_0(t) e^{j\theta(t)}$ is the result of small-scale fading and causes rapid fluctuations
about the local mean signal m(t).
7.1.1 Large-Scale Fading

In free-space propagation, the received signal power $P_R$ at distance R from the transmitter
and the transmitted power $P_T$ of an isotropic source are linked by the well-known relation

$$P_R = \frac{P_T}{4\pi R^2} \qquad (7.1.8)$$

Let $P_{R_0}$ denote the received signal power at a reference distance $R_0$ from the transmitter,
that is,

$$P_{R_0} = \frac{P_T}{4\pi R_0^2} \qquad (7.1.9)$$

It follows from (7.1.8) and (7.1.9) that the received powers $P_R$ and $P_{R_0}$ are related by

$$P_R = P_{R_0} \left( \frac{R_0}{R} \right)^2 \qquad (7.1.10)$$

In mobile radio channels, the mean path loss between a transmitter and a receiver is
proportional to the nth power of distance R relative to a reference distance $R_0$ rather than
the second power, as is the case for free-space propagation. In urban areas, a typical value of the
path loss exponent n is four. Denoting the received signal power at distance $R_0$ from the transmitter
by $\bar{S}(R_0)$, and the received signal power at a distance R from the transmitter by $\bar{S}(R)$, it
follows from (7.1.10) that for mobile radio channels,

$$\bar{S}(R) = \bar{S}(R_0) \left( \frac{R_0}{R} \right)^n \qquad (7.1.11)$$

Let $\bar{S}_d$ denote $\bar{S}(R)$ in decibels, that is,

$$\bar{S}_d = 10 \log \bar{S}(R) \qquad (7.1.12)$$

Substituting for $\bar{S}(R)$ from (7.1.11) in (7.1.12) leads to

$$\bar{S}_d = 10 \log \bar{S}(R_0) + 10 n \log\left( \frac{R_0}{R} \right) \qquad (7.1.13)$$

Note that the signal power $\bar{S}(R)$ received at R meters away from the transmitter is an
average value and is referred to as the area mean. Thus, $\bar{S}_d$ is the area mean signal power
in decibels. It is different from the mean signal power that we previously also referred to
as the local mean signal power and denoted by S (and $S_d$ in decibels). The relationship
between the two is now described.
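A short numerical sketch may help fix the scale of the path loss law. The function below simply evaluates the area mean in decibels from a reference power, a reference distance, and a path loss exponent; the function name `area_mean_db`, the argument names, and the example values are assumptions chosen for illustration.

```python
import math

def area_mean_db(S_ref, R_ref, R, n=4.0):
    """Area mean power in dB at distance R, given power S_ref at distance R_ref."""
    return 10.0 * math.log10(S_ref) + 10.0 * n * math.log10(R_ref / R)

# Assumed example: unit power at a 100 m reference distance.
urban_1km = area_mean_db(S_ref=1.0, R_ref=100.0, R=1000.0, n=4.0)
free_space_1km = area_mean_db(S_ref=1.0, R_ref=100.0, R=1000.0, n=2.0)
```

For a tenfold increase in distance, the area mean drops by 10n dB, that is, by 40 dB with n = 4 compared with 20 dB for the free-space exponent n = 2.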

The mean signal power S d is site dependent and for a given transmitter-receiver sepa¬ 
ration, it differs from location to location due to the shadowing effect. It is a random 
quantity with a normal distribution, and this randomness is reflected by adding a random 
quantity to the area mean power S d to yield an expression for the received mean power 
in decibels, yielding 
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(7.1.14) 


S =S.+X„ 

d d o s 

where X 0 is a zero-mean, Gaussian random variable (in decibels) with standard deviation 
o s (also in decibels). The parameter o s is called the decibel spread and is a site-dependent 
quantity. It may take on values between 6 to 12 dB depending on the severity of shadowing 
[Fre79]. 
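
The shadowing model above can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it treats the local mean in dB as a Gaussian variable centered on the area mean, and the names `prob_local_mean_below` and `sample_local_mean_db`, as well as the example values, are hypothetical.

```python
import math
import random


def prob_local_mean_below(threshold_db, area_mean_db, sigma_s_db):
    """Probability that the local mean power (dB) falls below threshold_db.

    Uses the Gaussian cdf written with the error function.
    """
    z = (threshold_db - area_mean_db) / (sigma_s_db * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def sample_local_mean_db(area_mean_db, sigma_s_db, rng):
    """Draw one local mean (dB): area mean plus zero-mean Gaussian shadowing."""
    return area_mean_db + rng.gauss(0.0, sigma_s_db)


rng = random.Random(7)  # fixed seed so the illustration is repeatable
print(sample_local_mean_db(-70.0, 8.0, rng))
print(prob_local_mean_below(-78.0, -70.0, 8.0))
```

With an 8 dB spread, a location whose local mean is one spread (8 dB) below the area mean or worse occurs roughly 16% of the time.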

It follows from (7.1.14) that $\bar{S}_d$ is a random variable with a mean value equal to the area
mean $\bar{\bar{S}}_d$, that is,

$$\bar{\bar{S}}_d = E[\bar{S}_d] \qquad (7.1.15)$$

Thus, $\bar{S}_d$ is a random variable having a normal distribution with the mean value equal to
$\bar{\bar{S}}_d$ and the standard deviation equal to $\sigma_s$. An expression for its probability density function
(pdf) $f_{\bar{S}_d}(\bar{S}_d)$ is given by [Yeh84]:

$$f_{\bar{S}_d}(\bar{S}_d) = \frac{1}{\sqrt{2\pi}\,\sigma_s}\exp\left[-\frac{(\bar{S}_d - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] \qquad (7.1.16)$$

Because the logarithm of $\bar{S}$ has a normal distribution, $\bar{S}$ is said to have a log-normal
distribution.

Note that the cumulative distribution function (cdf) $F_x$ of an RV x is related to its pdf by

$$F_x(y) = \int_0^y f_x(x)\,dx \qquad (7.1.17)$$

or alternately,

$$f_x(y) = \frac{dF_x(y)}{dy} \qquad (7.1.18)$$


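It follows from (7.1.16) and (7.1.17) that

$$F_{\bar{S}_d}(y) = \int_0^y \frac{1}{\sqrt{2\pi}\,\sigma_s}\exp\left[-\frac{(\bar{S}_d - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] d\bar{S}_d \qquad (7.1.19)$$

Since

$$\bar{S}_d = 10\log\bar{S} = 10\,\frac{\ln\bar{S}}{\ln 10} \qquad (7.1.20)$$

a differentiation on both sides with respect to $\bar{S}$ results in

$$d\bar{S}_d = \frac{10}{\ln 10}\,\frac{d\bar{S}}{\bar{S}} \qquad (7.1.21)$$

By substituting for $\bar{S}_d$ and $d\bar{S}_d$, it follows from (7.1.19) that the cdf of $\bar{S}$ is given by

$$F_{\bar{S}}(z) = \int_0^z \frac{10}{\sqrt{2\pi}\,\sigma_s \bar{S}\ln 10}\exp\left[-\frac{(10\log\bar{S} - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] d\bar{S} \qquad (7.1.22)$$

This along with (7.1.18) implies that the pdf of $\bar{S}$ can be expressed as

$$f_{\bar{S}}(\bar{S}) = \frac{10}{\sqrt{2\pi}\,\sigma_s \bar{S}\ln 10}\exp\left[-\frac{(10\log\bar{S} - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] \qquad (7.1.23)$$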


7.1.2 Small-Scale Fading 

For the discussion on small-scale fading, assume that the large-scale fading component
m(t) and thus the local mean signal power $\bar{S}$ remain constant. This would be the case
when the receiving antenna remains within a limited trajectory such that shadowing effects
may be ignored.

Under the assumption of m(t) being constant, it follows from (7.1.6) that the quantity
$r(t)e^{j\theta(t)}$ may be thought of as representing the small-scale fading effect similar to $r_0(t)e^{j\theta(t)}$.
This quantity is the resultant sum of many scattered multipath components of varying
amplitude and phase arriving at the receiving antenna. Denote this in terms of its orthogonal
components a(t) and b(t), that is,

$$r(t)e^{j\theta(t)} = a(t) + jb(t) \qquad (7.1.24)$$

The variables a(t) and b(t) result from the addition of many multipath components.
When the number of such components is large, these variables at a given time are statistically
independent, Gaussian random variables with a zero mean and equal variance $\sigma^2$.
Dropping the reference to time for ease of notation, one thus writes expressions for pdfs
of a and b as

$$f_a(z) = f_b(z) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{z^2}{2\sigma^2}\right] \qquad (7.1.25)$$

Writing an expression for the signal envelope from (7.1.24) as

$$r(t) = \sqrt{a^2(t) + b^2(t)} \qquad (7.1.26)$$

it follows that $r(t) \ge 0$ for all t. Furthermore, at a given time, it is an RV with

$$E[r^2] = E[a^2 + b^2] = E[a^2] + E[b^2] = 2\sigma^2 \qquad (7.1.27)$$




Next, an expression for the pdf of r is derived. Consider

$$Y = \sum_{i=1}^{N} x_i^2 \qquad (7.1.28)$$

where $x_i$, i = 1, 2, ..., N are statistically independent, Gaussian random variables with a
zero mean and equal variance $\sigma^2$. The pdf of Y is given by [Pro95]

$$f_Y(y) = \frac{1}{\sigma^N 2^{N/2}\,\Gamma(N/2)}\, y^{N/2-1}\exp\left[-\frac{y}{2\sigma^2}\right], \quad y \ge 0 \qquad (7.1.29)$$

where $\Gamma(p)$ is the gamma function, defined as

$$\Gamma(p) = \int_0^\infty t^{p-1}e^{-t}\,dt, \quad p > 0$$

It has the following properties [Abr72]:

$$\Gamma(p) = (p-1)!, \quad p \text{ an integer},\ p > 0 \qquad (7.1.30)$$

$$\Gamma(p)\Gamma(1-p) = -p\,\Gamma(p)\Gamma(-p) = \frac{\pi}{\sin \pi p}, \quad 0 < p < 1 \qquad (7.1.31)$$

and

$$\Gamma\left(\tfrac{1}{2}\right) = \sqrt{\pi}, \quad \Gamma\left(\tfrac{3}{2}\right) = \tfrac{1}{2}\sqrt{\pi} \qquad (7.1.32)$$

The pdf given by (7.1.29) is called a chi-square or gamma pdf with N degrees of freedom.
In the present case,

$$Y = r^2 = a^2 + b^2 \qquad (7.1.33)$$

Thus, N = 2 and the pdf of $Y = r^2$ becomes

$$f_Y(y) = \frac{1}{2\sigma^2}\exp\left[-\frac{y}{2\sigma^2}\right], \quad y \ge 0 \qquad (7.1.34)$$


Since the cdf of an RV x is related to its pdf by (7.1.17), it follows from (7.1.34) and (7.1.17)
that the cdf of Y, $F_Y$, is given by

$$F_Y(y) = \int_0^y \frac{1}{2\sigma^2}\exp\left[-\frac{y}{2\sigma^2}\right] dy \qquad (7.1.35)$$

Define

$$x = y^{1/2} \qquad (7.1.36)$$

Thus,

$$y = x^2 \qquad (7.1.37)$$

and

$$dy = 2x\,dx \qquad (7.1.38)$$

Substituting from (7.1.37) and (7.1.38) in (7.1.35) and using $r = Y^{1/2}$ leads to the cdf of r
given by

$$F_r(r) = \int_0^r \frac{x}{\sigma^2}\exp\left[-\frac{x^2}{2\sigma^2}\right] dx \qquad (7.1.39)$$

This along with the relation (7.1.18) yields

$$f_r(r) = \frac{r}{\sigma^2}\exp\left[-\frac{r^2}{2\sigma^2}\right], \quad r \ge 0 \qquad (7.1.40)$$


This is the pdf of a Rayleigh-distributed RV. Thus, r is an RV with Rayleigh distribution. 
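
The Rayleigh result can be checked by simulation. The following Python sketch is an illustration only: it draws the quadrature components a and b as independent zero-mean Gaussians, forms the envelope, and compares the sample statistics with $E[r^2] = 2\sigma^2$ and with the closed form of the Rayleigh cdf, $1 - \exp(-r^2/2\sigma^2)$. The function names, the seed, and the value $\sigma = 1.5$ are assumptions chosen for the example.

```python
import math
import random


def rayleigh_cdf(r, sigma):
    """Closed form of the Rayleigh cdf: 1 - exp(-r^2 / (2 sigma^2))."""
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma)) if r > 0 else 0.0


def simulate_envelopes(sigma, count, seed=1):
    """Envelope samples r = sqrt(a^2 + b^2) with a, b ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(count)]


sigma = 1.5
samples = simulate_envelopes(sigma, 50_000)
mean_square = sum(r * r for r in samples) / len(samples)
empirical_cdf = sum(r <= 2.0 for r in samples) / len(samples)

print("E[r^2]: simulated", mean_square, "theory", 2.0 * sigma ** 2)
print("P[r <= 2]: simulated", empirical_cdf, "theory", rayleigh_cdf(2.0, sigma))
```

The two printed pairs should agree to within the Monte Carlo error, which shrinks as the number of samples grows.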

When the received signal has a significant nonfading component (line-of-sight component)
other than the reflective multipath component, the received signal envelope has a
Rice distribution with pdf given by [Rap96]

$$f_r(r) = \begin{cases} \dfrac{r}{\sigma^2}\exp\left[-\dfrac{r^2 + A^2}{2\sigma^2}\right] I_0\left(\dfrac{Ar}{\sigma^2}\right), & r \ge 0,\ A \ge 0 \\ 0, & \text{otherwise} \end{cases} \qquad (7.1.41)$$

where A denotes the peak amplitude of the line-of-sight (LOS) component and $I_0(\cdot)$ is the
modified Bessel function of the first kind and zero order. The Rice distribution is often
characterized by a parameter $K_0$ defined as

$$K_0 = \frac{A^2}{2\sigma^2} \qquad (7.1.42)$$

As the magnitude of the LOS component A goes to zero, the Rice pdf approaches the
Rayleigh pdf given by (7.1.40).

Another distribution that is frequently used to describe the statistics of signals transmitted
through multipath fading channels is the Nakagami m-distribution with pdf given
by [Pro95]:

$$f_r(r) = \frac{2}{\Gamma(m)}\left(\frac{m}{\Omega}\right)^m r^{2m-1} e^{-mr^2/\Omega} \qquad (7.1.43)$$

where

$$\Omega = E[r^2] \qquad (7.1.44)$$

The fading parameter m is defined as

$$m = \frac{\Omega^2}{E[(r^2 - \Omega)^2]}, \quad m \ge \frac{1}{2} \qquad (7.1.45)$$

and $\Gamma(m)$ is the gamma function.

Note that when m = 1, (7.1.43) reduces to the Rayleigh pdf given by (7.1.40). For values
of m in the range of 1/2 < m < 1, (7.1.43) results in pdfs having larger tails than a Rayleigh
pdf, whereas for m > 1 the tails of this pdf decay faster than the Rayleigh pdf. The case
$m = \infty$ denotes no fading.
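
The special cases just described can be verified numerically. The sketch below is an illustration under the assumption that the Nakagami pdf has the standard form $f_r(r) = \frac{2}{\Gamma(m)}(m/\Omega)^m r^{2m-1}e^{-mr^2/\Omega}$ with $\Omega = E[r^2]$; the function names are hypothetical. It checks that m = 1 with $\Omega = 2\sigma^2$ reproduces the Rayleigh pdf and compares tail values for different m.

```python
import math


def nakagami_pdf(r, m, omega):
    """Nakagami m-distribution pdf with spread parameter omega = E[r^2]."""
    if r < 0:
        return 0.0
    return (2.0 / math.gamma(m)) * (m / omega) ** m \
        * r ** (2.0 * m - 1.0) * math.exp(-m * r * r / omega)


def rayleigh_pdf(r, sigma):
    """Rayleigh pdf (r / sigma^2) exp(-r^2 / (2 sigma^2))."""
    return (r / sigma ** 2) * math.exp(-r * r / (2.0 * sigma ** 2)) if r >= 0 else 0.0


sigma = 1.0
for r in (0.5, 1.0, 2.0):
    # m = 1 and omega = 2 sigma^2 should coincide with the Rayleigh pdf.
    print(r, nakagami_pdf(r, 1.0, 2.0 * sigma ** 2), rayleigh_pdf(r, sigma))

# Tail comparison at a large envelope value for a common omega.
for m in (0.5, 1.0, 3.0):
    print("m =", m, "pdf at r = 2.5:", nakagami_pdf(2.5, m, 1.0))
```

The tail values decrease as m increases, which matches the statement that small m corresponds to more severe fading.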

The discussion presented thus far relates to distribution of the received signal amplitude. 
Now the distribution of signal power is considered. 


7.1.3 Distribution of Signal Power 

The instantaneous power S of the received signal $x_0(t)$ is given by

$$S = x_0^2(t) \qquad (7.1.46)$$

It follows from (7.1.4), (7.1.5) and (7.1.46) that the expression for the instantaneous power
averaged over one radio frequency cycle reduces to

$$S = \frac{r^2}{2} \qquad (7.1.47)$$

Equation (7.1.47) along with (7.1.27) implies that the local mean signal power $\bar{S}$ is given by

$$\bar{S} = E[S] = \sigma^2 \qquad (7.1.48)$$

Now, an expression for $f_S$, the pdf of S, is derived for the case when the signal amplitude
has the Rayleigh distribution given by (7.1.40). For this case, the cdf of r, $F_r$, is given by
(7.1.39).

To transform $f_r$ into $f_S$, define a new variable:

$$y = \frac{x^2}{2} \qquad (7.1.49)$$

Thus,

$$dy = x\,dx \qquad (7.1.50)$$

Using (7.1.47) to (7.1.50) in (7.1.39), the cdf of S is given by

$$F_S(S) = \int_0^S \frac{1}{\bar{S}}\, e^{-y/\bar{S}}\,dy \qquad (7.1.51)$$

This along with the relation

$$f_S(S) = \frac{dF_S(S)}{dS} \qquad (7.1.52)$$

yields

$$f_S(S) = \frac{1}{\bar{S}}\, e^{-S/\bar{S}} \qquad (7.1.53)$$

Similarly, the pdf of S in Nakagami and Rice fading environments may be obtained.
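
The change of variable from envelope to power can be confirmed with a few lines of Python. This is a sketch under the assumptions $S = r^2/2$ and $\bar{S} = \sigma^2$ used in the derivation above; the function names and the test value $\sigma = 0.8$ are hypothetical. It compares the exponential cdf of the power with the Rayleigh cdf evaluated at $r = \sqrt{2S}$.

```python
import math


def rayleigh_cdf(r, sigma):
    """Closed form of the envelope cdf: 1 - exp(-r^2 / (2 sigma^2))."""
    return 1.0 - math.exp(-r * r / (2.0 * sigma * sigma)) if r > 0 else 0.0


def power_cdf(s, mean_power):
    """cdf of the instantaneous power for a Rayleigh envelope (exponential law)."""
    return 1.0 - math.exp(-s / mean_power) if s > 0 else 0.0


def power_pdf(s, mean_power):
    """pdf of the instantaneous power: (1 / mean) exp(-s / mean)."""
    return math.exp(-s / mean_power) / mean_power if s >= 0 else 0.0


sigma = 0.8
mean_power = sigma ** 2  # local mean power equals sigma^2 when S = r^2 / 2
for s in (0.1, 0.5, 2.0):
    print(s, power_cdf(s, mean_power), rayleigh_cdf(math.sqrt(2.0 * s), sigma))
```

Each printed pair should be identical up to rounding, since the event S ≤ s is the same as the event r ≤ √(2s).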

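For the case of a Nakagami-distributed environment, S is a gamma-distributed RV with
pdf $f_S(S)$ given by [Abu91, Woj86]:

$$f_S(S) = \left(\frac{m}{\bar{S}}\right)^m \frac{S^{m-1}}{\Gamma(m)}\exp\left[-\frac{mS}{\bar{S}}\right], \quad S \ge 0 \qquad (7.1.54)$$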


7.2 Channel Gain 

In this section, the concept of channel gain, sometimes also referred to as channel attenuation
[Pro95] (to be used later in the book), is introduced and selected signal and power
variables are expressed using the channel gain. Define a real variable

$$\alpha(t) = \frac{r(t)}{\sqrt{\bar{S}}} \qquad (7.2.1)$$

to denote the signal envelope normalized with respect to the square root of the mean
signal power $\bar{S}$ and a complex variable C(t)

$$C(t) = \alpha(t)e^{j\theta(t)} \qquad (7.2.2)$$

to denote the channel gain.

Thus, the received signal in the complex form given by (7.1.5) can be expressed as

$$\tilde{x}_0(t) = C(t)\sqrt{\bar{S}}\,g(t)\,e^{j2\pi f_0 t} \qquad (7.2.3)$$

where g(t) is the transmitted signal normalized to have a unit energy, that is,

$$\int_0^T [g(t)]^2\,dt = 1 \qquad (7.2.4)$$

Assume that the channel varies slowly such that $\alpha(t)$ and $\theta(t)$ may be regarded as
constant over a time duration T of interest, such as a bit or symbol duration. Thus, over
this time the channel gain can be regarded as constant and the reference to time t may be
dropped, yielding

$$\tilde{x}_0(t) = C\sqrt{\bar{S}}\,g(t)\,e^{j2\pi f_0 t}, \quad 0 \le t \le T \qquad (7.2.5)$$

The instantaneous signal power S averaged over time T is then given by

$$S = \frac{\alpha^2 \bar{S}}{2} \qquad (7.2.6)$$

In view of (7.2.1) and the fact that the mean signal power is constant for a given large-scale
fading, the channel gain is a complex RV and has the same statistics as $r(t)e^{j\theta(t)}$.


7.3 Single-Antenna System 

In this section, a single-channel system is considered, and its performance is examined by 
studying the outage probability and the average bit error rate (BER) in the presence of 
frequency nonselective slow-fading channels. Both noise-limited and interference-limited 
systems are considered. The methodology to evaluate outage probability and average BER 
presented here should be helpful in evaluating these parameters for various diversity 
schemes discussed in Chapter 8. 

7.3.1 Noise-Limited System 

In a noise-limited system, system performance is limited by noise and the effect of co-channel
interference is negligible. Consider a noise-limited system with a single source in
the presence of an additive white Gaussian noise (AWGN) of zero mean and variance N.
Let $\gamma$ denote the instantaneous signal power to the mean noise power ratio, that is,

$$\gamma = \frac{S}{N} \qquad (7.3.1)$$

As discussed in Section 7.1, S is an RV and thus $\gamma$ is an RV. Let $\Gamma$ denote its mean value,
that is,

$$\Gamma = E[\gamma] \qquad (7.3.2)$$

Substituting from (7.3.1) in (7.3.2) and noting that $\bar{S}$ denotes the mean value of S,

$$\Gamma = \frac{\bar{S}}{N} \qquad (7.3.3)$$

For a receiver to function properly in a noise-limited system the SNR at its input must
be above a certain threshold $\gamma_0$. When the SNR drops below this threshold, the communication
link does not remain operational. The probability of this happening is referred to as the
outage probability, denoted by $P^o$. Denoting P[x] as the probability of an event x, $P^o$ may
be expressed as

$$P^o = P[\gamma < \gamma_0] = \int_0^{\gamma_0} f_\gamma(\gamma)\,d\gamma = F_\gamma(\gamma_0) \qquad (7.3.4)$$

where $f_\gamma$ and $F_\gamma$, respectively, denote the pdf and cdf of $\gamma$.

It follows from (7.3.4) that evaluation of outage probability requires knowledge of the
pdf or the cdf of $\gamma$, which depends on the fading environment. The pdf of $\gamma$ is also useful
in determining the average BER in the fading environment. When the conditional probability
of bit error for a given value of the SNR, $P_e(\gamma)$, is known, the average BER, $P_e$, may
be obtained by averaging over $\gamma$, that is,

$$P_e = \int_0^\infty P_e(\gamma)\, f_\gamma(\gamma)\,d\gamma \qquad (7.3.5)$$

7.3.1.1 Rayleigh Fading Environment 


In this case, the signal amplitude has a Rayleigh distribution. First, we derive the pdf and
cdf of $\gamma$. The pdf of S is given by (7.1.53), that is,

$$f_S(S) = \frac{1}{\bar{S}}\, e^{-S/\bar{S}} \qquad (7.3.6)$$

It follows from (7.1.17) and (7.3.6) that the cdf of S is given as

$$F_S(S) = \int_0^S \frac{1}{\bar{S}}\, e^{-x/\bar{S}}\,dx \qquad (7.3.7)$$

Define a new variable:

$$y = \frac{x}{N} \qquad (7.3.8)$$

Thus,

$$x = Ny \qquad (7.3.9)$$

and

$$dx = N\,dy \qquad (7.3.10)$$

Substituting from (7.3.9) and (7.3.10) in (7.3.7) and using (7.3.1) and (7.3.3),

$$F_\gamma(\gamma) = \int_0^\gamma \frac{1}{\Gamma}\, e^{-y/\Gamma}\,dy = 1 - e^{-\gamma/\Gamma} \qquad (7.3.11)$$

Equations (7.1.17) and (7.3.11) imply that

$$f_\gamma(\gamma) = \frac{1}{\Gamma}\, e^{-\gamma/\Gamma} \qquad (7.3.12)$$

Using (7.3.4), the outage probability then becomes

$$P^o = F_\gamma(\gamma_0) = 1 - e^{-\gamma_0/\Gamma} \qquad (7.3.13)$$
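
As a worked example of the Rayleigh outage result, $P^o = 1 - e^{-\gamma_0/\Gamma}$, the following Python sketch evaluates the outage probability for a threshold and a mean SNR specified in decibels. The helper names `db_to_linear` and `outage_rayleigh`, and the example values, are assumptions introduced for illustration.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio from decibels to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def outage_rayleigh(gamma_0, mean_snr):
    """Outage probability 1 - exp(-gamma_0 / mean_snr) for Rayleigh fading.

    Both arguments are linear (not dB) power ratios.
    """
    return 1.0 - math.exp(-gamma_0 / mean_snr)


# Hypothetical link: 10 dB threshold, mean SNR swept from 10 dB to 30 dB.
threshold = db_to_linear(10.0)
for mean_snr_db in (10.0, 20.0, 30.0):
    print(mean_snr_db, outage_rayleigh(threshold, db_to_linear(mean_snr_db)))
```

When the mean SNR is well above the threshold, the outage probability is close to $\gamma_0/\Gamma$, so each additional 10 dB of mean SNR reduces the outage by roughly a factor of ten.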


7.3.1.2 Nakagami Fading Environment 

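For this case, the pdf of S is given by (7.1.54). Following the procedure of the previous
section, the following expression for the pdf of $\gamma$ is obtained:

$$f_\gamma(\gamma) = \left(\frac{m}{\Gamma}\right)^m \frac{\gamma^{m-1}}{\Gamma(m)}\, e^{-m\gamma/\Gamma} \qquad (7.3.14)$$

The cumulative distribution function of $\gamma$ then is given by

$$F_\gamma(\gamma) = \int_0^\gamma \left(\frac{m}{\Gamma}\right)^m \frac{x^{m-1}}{\Gamma(m)}\, e^{-mx/\Gamma}\,dx \qquad (7.3.15)$$

Defining

$$y = mx \qquad (7.3.16)$$

and noting that for integer m, $\Gamma(m) = (m-1)!$, (7.3.15) becomes

$$F_\gamma(\gamma) = \frac{1}{\Gamma^m (m-1)!}\int_0^{m\gamma} y^{m-1} e^{-y/\Gamma}\,dy = 1 - e^{-m\gamma/\Gamma}\sum_{k=0}^{m-1}\frac{(m\gamma/\Gamma)^k}{k!} \qquad (7.3.17)$$

yielding

$$P^o = F_\gamma(\gamma_0) \qquad (7.3.18)$$

Here $\Gamma$ without an argument denotes the mean SNR defined by (7.3.2), whereas $\Gamma(m)$ denotes
the gamma function.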
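
The finite-sum form of the Nakagami outage probability for integer m can be evaluated directly. The Python sketch below is an illustration, assuming the closed form $P^o = 1 - e^{-m\gamma_0/\Gamma}\sum_{k=0}^{m-1}(m\gamma_0/\Gamma)^k/k!$, which is the regularized incomplete gamma function for integer m; the function name `outage_nakagami` is hypothetical.

```python
import math


def outage_nakagami(gamma_0, mean_snr, m):
    """Outage probability in Nakagami fading for integer fading parameter m.

    gamma_0 and mean_snr are linear power ratios.
    """
    if m < 1 or int(m) != m:
        raise ValueError("this closed form needs a positive integer m")
    x = m * gamma_0 / mean_snr
    partial_sum = sum(x ** k / math.factorial(k) for k in range(int(m)))
    return 1.0 - math.exp(-x) * partial_sum


# Threshold 10 dB below the mean SNR, for increasing m (less severe fading).
for m in (1, 2, 3, 5):
    print(m, outage_nakagami(1.0, 10.0, m))
```

For m = 1 the function returns the Rayleigh value $1 - e^{-\gamma_0/\Gamma}$, and the outage probability falls quickly as m grows because deep fades become rarer.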


7.3.2 Interference-Limited System 

In an interference-limited system, system performance is limited by total interference
power and not noise power, as is the case for noise-limited systems. In this case, the effect
of noise is negligible and thus is ignored.

In this section, the effect of co-channel interference is examined by deriving the expression
for the probability of the signal-to-interference ratio $\mu$ being less than the desired threshold
value $\mu_0$ in a Nakagami fading environment [Abu91]. Rayleigh fading is treated as a special
case.

Assume that the received signal amplitude is an RV with a Nakagami distribution. Then
the signal power S is a gamma-distributed RV with pdf given by (7.1.54).

Assume that there are K co-channel interferences present. Let $q_i$, i = 1, 2, ..., K be i.i.d.
RVs with Nakagami distribution, denoting the amplitudes of these interferences. Let $I_i$,
i = 1, 2, ..., K denote their instantaneous powers, that is,

$$I_i = \frac{q_i^2}{2}, \quad i = 1, 2, \ldots, K \qquad (7.3.19)$$

Let $\psi_i$, i = 1, 2, ..., K be i.i.d. random-phase processes associated with the K interferences.
It is assumed that r, $\theta$, $q_i$, and $\psi_i$, i = 1, 2, ..., K are mutually independent. Note that r and
$\theta$ denote the amplitude and phase of the signal, respectively.

The interference power $I_i$ is a gamma-distributed RV with pdf given by (following (7.1.54))

$$f_{I_i}(I_i) = \left(\frac{m_i}{\bar{I}_i}\right)^{m_i} \frac{I_i^{m_i-1}}{\Gamma(m_i)}\exp\left[-\frac{m_i I_i}{\bar{I}_i}\right] \qquad (7.3.20)$$

where $m_i$ is the fading parameter and $\bar{I}_i$ is the mean power of the ith interference.


7.3.2.1 Identical Interferences 

Consider that all interferences have identical statistics with the fading parameter denoted
by m (same as the signal) and equal mean power denoted by $\bar{I}$. As interferences are
independent, the total interference power I is given by

$$I = \sum_{i=1}^{K} I_i \qquad (7.3.21)$$

I is a gamma-distributed RV with pdf given by [Abu91, Fel66]

$$f_I(I) = \left(\frac{m}{\bar{I}}\right)^{mK} \frac{I^{mK-1}}{\Gamma(mK)}\exp\left[-\frac{mI}{\bar{I}}\right] \qquad (7.3.22)$$

Note that I denotes the total interference power and $\bar{I}$ denotes the mean power of each
interference.

Assume that the channel becomes inoperable when the ratio of signal power S to total
interference power I becomes less than some desired value $\mu_0$. Thus, the outage probability
with K interferences, $P^o_K$, can be written as

$$P^o_K = P\left[\frac{S}{I} < \mu_0\right] = P[S < \mu_0 I] \qquad (7.3.23)$$

This could be solved by letting S = x, finding the probability that $\mu_0 I > x$, and integrating
over x. Thus, (7.3.23) becomes [Fre79]
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it follows from (7.3.24) that 
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(7.3.24) 

(7.3.25) 

(7.3.26) 


(7.3.27) 


Substituting for $f_S$ and $f_I$ from (7.1.54) and (7.3.22), respectively, and evaluating the
integral [Abu91],

$$P^o_K = I_x(m, mK), \quad m \ge 0.5 \qquad (7.3.28)$$

where

$$x = \frac{1}{1 + \bar{\mu}/\mu_0} \qquad (7.3.29)$$

$\bar{\mu}$ denotes the ratio of the average signal power to the average power of a single
interference, that is,

$$\bar{\mu} = \frac{\bar{S}}{\bar{I}} \qquad (7.3.30)$$

and $I_x(m, mK)$ is the incomplete beta function, given by

$$I_x(a, b) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\int_0^x t^{a-1}(1-t)^{b-1}\,dt \qquad (7.3.31)$$

For a special case where m is an integer, (7.3.28) reduces to

$$P^o_K = x^m \sum_{k=0}^{mK-1}\binom{m+k-1}{k}(1-x)^k$$

For the Rayleigh fading environment, m = 1 and $P^o_K$ becomes

$$P^o_K = \frac{\mu_0}{\bar{\mu} + \mu_0}\sum_{k=0}^{K-1}\left(\frac{\bar{\mu}}{\bar{\mu} + \mu_0}\right)^k \qquad (7.3.34)$$

which for a single interference case (K = 1) reduces to

$$P^o = \frac{\mu_0}{\bar{\mu} + \mu_0} \qquad (7.3.35)$$

Thus, (7.3.35) gives the outage probability in the presence of one interference with the
same statistics as the signal in the Rayleigh fading environment.
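
The Rayleigh special case lends itself to a short numerical check. The sketch below is an illustration, assuming K independent, identically distributed Rayleigh-faded interferers so that the outage probability is $x\sum_{k=0}^{K-1}(1-x)^k$ with $x = \mu_0/(\bar{\mu}+\mu_0)$; summing the geometric series gives the equivalent form $1 - (\bar{\mu}/(\bar{\mu}+\mu_0))^K$. The function names are hypothetical.

```python
def outage_rayleigh_interference(mean_sir, threshold, num_interferers):
    """Outage probability with K identical Rayleigh-faded interferers (sum form).

    mean_sir  : ratio of mean signal power to mean power of ONE interferer
    threshold : required signal-to-total-interference ratio (linear)
    """
    x = threshold / (mean_sir + threshold)
    return x * sum((1.0 - x) ** k for k in range(num_interferers))


def outage_rayleigh_interference_closed(mean_sir, threshold, num_interferers):
    """Same quantity after summing the geometric series."""
    return 1.0 - (mean_sir / (mean_sir + threshold)) ** num_interferers


# Hypothetical case: mean SIR per interferer 20 dB, threshold 10 dB.
for k in (1, 2, 6):
    print(k,
          outage_rayleigh_interference(100.0, 10.0, k),
          outage_rayleigh_interference_closed(100.0, 10.0, k))
```

With a single interferer the result is $\mu_0/(\bar{\mu}+\mu_0)$, and each additional interferer of the same mean power raises the outage probability.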


7.3.2.2 Signal and Interference with Different Statistics 

Consider the case of an interference with fading statistics different from those of the signal
[Abu91]. Assume that only one interference exists. Let $q_1$ denote the amplitude of the
interference with Nakagami distribution of parameter $m_1$, whereas the signal amplitude
is assumed to be Nakagami distributed with parameter m. The interference power

$$I = \frac{q_1^2}{2} \qquad (7.3.36)$$

is a gamma-distributed RV with pdf given by

$$f_I(I) = \left(\frac{m_1}{\bar{I}}\right)^{m_1} \frac{I^{m_1-1}}{\Gamma(m_1)}\exp\left[-\frac{m_1 I}{\bar{I}}\right], \quad I \ge 0 \qquad (7.3.37)$$

where $\bar{I}$ denotes the mean interference power. Substituting for $f_S$ and $f_I$ from (7.1.54) and
(7.3.37), respectively, in (7.3.27), and evaluating the integrals for integer values of m and $m_1$,

(7.3.38)

For details on (7.3.38) and results when there is more than one interference with different
statistics, see [Abu91].


7.3.3 Interference with Nakagami Fading and Shadowing 

In this section, the analysis is extended to a scenario where both desired signal and
interference experience Nakagami fading in the presence of log-normal shadowing
[Abu91, Fre79]. The analysis in the previous section was without shadowing, and thus
the mean values of signals and interferences were assumed to be constant. In the presence
of shadowing, the mean signal value $\bar{S}$ and the mean interference value $\bar{I}$ are log-normal
distributed random variables with respective pdfs given by

$$f_{\bar{S}_d}(\bar{S}_d) = \frac{1}{\sqrt{2\pi}\,\sigma_s}\exp\left[-\frac{(\bar{S}_d - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] \qquad (7.3.39)$$

and

$$f_{\bar{I}_d}(\bar{I}_d) = \frac{1}{\sqrt{2\pi}\,\sigma_I}\exp\left[-\frac{(\bar{I}_d - \bar{\bar{I}}_d)^2}{2\sigma_I^2}\right] \qquad (7.3.40)$$

where

$$\bar{S}_d = 10\log\bar{S} \qquad (7.3.41)$$

$$\bar{I}_d = 10\log\bar{I} \qquad (7.3.42)$$

and

$$\bar{\bar{S}}_d = E[\bar{S}_d] \qquad (7.3.43)$$

$$\bar{\bar{I}}_d = E[\bar{I}_d] \qquad (7.3.44)$$

and $\sigma_s$ and $\sigma_I$ are decibel spread parameters for signal and interference, respectively.

As discussed in previous sections, calculation of the outage probability requires unconditional
pdfs of the signal power and interference power, $f_S$ and $f_I$, respectively. These may
be obtained by combining the pdfs of the mean signal power and mean interference power
given by (7.3.39) and (7.3.40), respectively, with the corresponding conditional pdfs of the
signal power and interference power given by (7.1.54) and (7.3.37), respectively. Note that
the pdfs given by (7.1.54) and (7.3.37) are conditional on the mean signal power and
mean interference power being constant. Denoting the conditional pdf of the signal power
given by (7.1.54) as $f_{S/\bar{S}_d}$, it follows that

$$f_S(S) = \int f_{S/\bar{S}_d}(S)\, f_{\bar{S}_d}(\bar{S}_d)\,d\bar{S}_d \qquad (7.3.45)$$

Similarly, denoting the conditional pdf of the interference power given by (7.3.37) as $f_{I/\bar{I}_d}$,
it follows that

$$f_I(I) = \int f_{I/\bar{I}_d}(I)\, f_{\bar{I}_d}(\bar{I}_d)\,d\bar{I}_d \qquad (7.3.46)$$

Let the outage probability $P^o$ denote the probability that $I \ge S/\mu_0$ in the presence of a
single interference. The probability may be evaluated using (7.3.27), that is,

$$P^o = P\left[I \ge \frac{S}{\mu_0}\right] = \int_0^\infty f_S(S)\int_{S/\mu_0}^\infty f_I(I)\,dI\,dS \qquad (7.3.47)$$

Substituting for $f_S$ and $f_I$ from (7.3.45) and (7.3.46) in (7.3.47), and evaluating the resulting
double integral leads to an expression for the outage probability, $P^o$. It can be achieved
by following a procedure similar to that used by [Fre79] for converting a double integral
into a single integral to evaluate the outage probability in the presence of Rayleigh fading
and shadowing.

In [Abu91], a slightly different approach was used to obtain $P^o$. It uses the expression
for $P^o$ in the absence of shadowing given by (7.3.38), and averages it using the joint pdf
of the signal power and interference power to include the effect of shadowing. Denoting
the joint pdf of $\bar{S}_d$ and $\bar{I}_d$ by $f_{\bar{S}_d,\bar{I}_d}$, it follows from (7.3.39) and (7.3.40) that

$$f_{\bar{S}_d,\bar{I}_d}(\bar{S}_d, \bar{I}_d) = \frac{1}{2\pi\sigma_I\sigma_s}\exp\left[-\frac{(\bar{I}_d - \bar{\bar{I}}_d)^2}{2\sigma_I^2} - \frac{(\bar{S}_d - \bar{\bar{S}}_d)^2}{2\sigma_s^2}\right] \qquad (7.3.48)$$

Noting from (7.3.30) and (7.1.3) that

(7.3.49)

substituting for $\bar{\mu}$ from (7.3.49) in (7.3.38) and averaging the result using (7.3.48) yields

(7.3.50)

where

$$\sigma = \sigma_s = \sigma_I \qquad (7.3.51)$$

and m and $m_1$ are the signal and interference fading parameters, respectively, and $\mu_a$ is
the ratio of the area mean signal power to the area mean interference power. Using $\bar{\bar{S}}_d$ and
$\bar{\bar{I}}_d$, $\mu_a$ can be expressed as

$$\mu_a = 10^{(\bar{\bar{S}}_d - \bar{\bar{I}}_d)/10} \qquad (7.3.52)$$

For the Rayleigh fading case,

$$m = m_1 = 1 \qquad (7.3.53)$$

and thus (7.3.50) becomes

(7.3.54)

The integrals in (7.3.50) and (7.3.54) can be evaluated using numerical methods for a
suitable choice of $\mu_0$.


7.3.4 Error Rate Performance 

Let $P_e(\gamma)$ denote the conditional probability of error for a given SNR $\gamma$ for a particular
modulation technique, while $P_e$ denotes the average BER. In fading conditions when $\gamma$ is
a random variable with pdf $f_\gamma$, $P_e$ is obtained from $P_e(\gamma)$ by averaging over all $\gamma$ using
(7.3.5). For various modulation schemes, $P_e(\gamma)$ in additive white Gaussian noise is given
below [Skl01].

For coherent binary phase shift keying (BPSK),

$$P_e(\gamma) = Q\left(\sqrt{2\gamma}\right) = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\gamma}\right) \qquad (7.3.55)$$

differentially coherent binary phase shift keying (DPSK),

$$P_e(\gamma) = \frac{1}{2}\,e^{-\gamma} \qquad (7.3.56)$$

coherent orthogonal frequency shift keying (CFSK),

$$P_e(\gamma) = Q\left(\sqrt{\gamma}\right) = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\frac{\gamma}{2}}\right) \qquad (7.3.57)$$

and noncoherent orthogonal frequency shift keying (NCFSK),

$$P_e(\gamma) = \frac{1}{2}\,e^{-\gamma/2} \qquad (7.3.58)$$

where

$$Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^\infty e^{-t^2/2}\,dt \qquad (7.3.59)$$

and related to erfc(x) as

$$Q(x) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right) \qquad (7.3.60)$$

For example, consider an NCFSK system in additive white Gaussian noise and Rayleigh
fading conditions. For this case, $P_e(\gamma)$ is given by (7.3.58). In the Rayleigh fading environment,
the pdf of $\gamma$, $f_\gamma$, is given by (7.3.12). Substituting for $f_\gamma$ and $P_e(\gamma)$ in (7.3.5) and denoting
the BER for this case by $P_{e,NCFSK}$, it follows that

$$P_{e,NCFSK} = \int_0^\infty \frac{1}{2}\,e^{-\gamma/2}\,\frac{1}{\Gamma}\,e^{-\gamma/\Gamma}\,d\gamma \qquad (7.3.61)$$

where $\Gamma$ is the mean signal-power to noise-power ratio.
Carrying out the integral, (7.3.61) yields

$$P_{e,NCFSK} = \frac{1}{2 + \Gamma} \qquad (7.3.62)$$

Similarly, (7.3.5) may be used to evaluate the BER for other modulation schemes in fading
conditions when the conditional BER for a given modulation scheme and the pdf of $\gamma$ in fading
conditions are known.

In Rayleigh fading channels, expressions for BER for coherent BPSK, CFSK, and DPSK
are given by [Pro95]

$$P_{e,BPSK} = \frac{1}{2}\left[1 - \sqrt{\frac{\Gamma}{1 + \Gamma}}\right] \qquad (7.3.63)$$

$$P_{e,CFSK} = \frac{1}{2}\left[1 - \sqrt{\frac{\Gamma}{2 + \Gamma}}\right] \qquad (7.3.64)$$

$$P_{e,DPSK} = \frac{1}{2(1 + \Gamma)} \qquad (7.3.65)$$

It follows from (7.3.62) and (7.3.65) that the average BER for NCFSK may be obtained
from the average BER for DPSK by replacing $\Gamma$ with $\Gamma/2$. Similarly, it follows from (7.3.63)
and (7.3.64) that the average BER for CFSK may be obtained from the average BER for
BPSK by replacing $\Gamma$ with $\Gamma/2$.
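
The averaging of the conditional BER over the fading pdf can also be carried out numerically, which is useful when no closed form is available. The Python sketch below is an illustration, not the method of the cited references: it assumes the standard AWGN expressions (BPSK $Q(\sqrt{2\gamma})$, DPSK $\frac{1}{2}e^{-\gamma}$, CFSK $Q(\sqrt{\gamma})$, NCFSK $\frac{1}{2}e^{-\gamma/2}$) and the Rayleigh pdf $f_\gamma(\gamma) = e^{-\gamma/\Gamma}/\Gamma$, and uses a simple midpoint rule. All function names, the step count, and the truncation point of the integral are assumptions.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = 0.5 erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_bpsk(snr):
    return q_function(math.sqrt(2.0 * snr))


def ber_dpsk(snr):
    return 0.5 * math.exp(-snr)


def ber_cfsk(snr):
    return q_function(math.sqrt(snr))


def ber_ncfsk(snr):
    return 0.5 * math.exp(-snr / 2.0)


def average_ber_rayleigh(conditional_ber, mean_snr, steps=20000, span=40.0):
    """Midpoint-rule average of conditional_ber over the Rayleigh SNR pdf.

    The integral is truncated at span * mean_snr, where the pdf is negligible.
    """
    width = span * mean_snr / steps
    total = 0.0
    for k in range(steps):
        snr = (k + 0.5) * width
        total += conditional_ber(snr) * math.exp(-snr / mean_snr) / mean_snr
    return total * width


mean_snr = 10.0  # linear mean SNR, i.e. 10 dB
print("NCFSK", average_ber_rayleigh(ber_ncfsk, mean_snr), 1.0 / (2.0 + mean_snr))
print("DPSK ", average_ber_rayleigh(ber_dpsk, mean_snr), 1.0 / (2.0 * (1.0 + mean_snr)))
print("BPSK ", average_ber_rayleigh(ber_bpsk, mean_snr),
      0.5 * (1.0 - math.sqrt(mean_snr / (1.0 + mean_snr))))
print("CFSK ", average_ber_rayleigh(ber_cfsk, mean_snr),
      0.5 * (1.0 - math.sqrt(mean_snr / (2.0 + mean_snr))))
```

Each line prints the numerical average next to the corresponding closed-form Rayleigh result, and the two columns should agree to several decimal places.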


Notation and Abbreviations 


BER bit error rate 

BPSK binary phase shift keying 

DPSK differential phase shift keying 

FSK frequency shift keying 

NCFSK noncoherent FSK 

A amplitude of line-of-sight component

a(t) real part of $r(t)e^{j\theta(t)}$

b(t) imaginary part of $r(t)e^{j\theta(t)}$

C(t) channel gain

cumulative distribution function 
cumulative distribution function of RV x 
carrier frequency 

probability density function of random variable x 

conditional pdf of x for given y 

complex baseband signal 

total power from all interference 

mean power of single interference 

modified Bessel function of the first kind and zero order 
interference power in dB 
mean value of I d 

instantaneous power of ith interference 

incomplete beta function 

mean power of ith interference 

number of interferences 

Rice distribution parameter 

accounts for large-scale fading 

Nakagami fading parameter 

Nakagami fading parameter for ith interference 

mean noise power 

path loss exponent in mobile communications 

probability density function 

probability of an event [ ] 

average probability of error 

probability of error for given y 

BER for modulation method x 

received power in free space 

transmitted power of an isotropic source 

outage probability 

outage probability when K interferences are present 
signal power 

amplitude of the ith interference 

distance between transmitter and receiver 

random variable 

reference distance 

accounts for small-scale fading 

received signal amplitude 

$\bar{S}$ local mean received signal power

S(R) area mean, mean signal power received at distance R from transmitter

$\bar{S}_d$ local mean power in dB

$\bar{\bar{S}}_d$ area mean power in dB and mean value of $\bar{S}_d$

$x_0(t)$ received signal

$\tilde{x}_0(t)$ received signal in complex form

$X_{\sigma_s}$ zero-mean, Gaussian random variable with standard deviation $\sigma_s$ (dB)

$\Gamma$ mean signal power to noise power ratio

$\Gamma(p)$ gamma function

$\alpha$, $\alpha(t)$ normalized signal envelope

$\gamma$ instantaneous signal power to mean noise power ratio

$\gamma_0$ threshold value of $\gamma$ for outage

$\phi(t)$ phase of complex baseband signal

$\theta(t)$ phase delay introduced by channel

$\sigma$ standard deviation of a(t) and b(t)

$\sigma_I$ dB spread parameter of interference

$\sigma_s$ dB spread parameter of signal

$\mu$ signal to interference power ratio

$\mu_0$ threshold value of $\mu$ for outage

$\bar{\mu}$ ratio of mean powers of signal and interference

$\mu_a$ ratio of area mean powers of signal and interference

$\psi_i$ phase of the ith interference

$\Omega$ $E[r^2]$

In the presence of multipath fading channels, the received signal experiences great attenuation
while the channel is in deep fade, resulting in the loss of transmitted information.
This loss can be reduced by combining signals received over several independent fading
channels. The reason for the reduction in information loss is that the likelihood of all
signals experiencing deep fade simultaneously is considerably less than that experienced
by individual signals.

The process of combining several signals with independent fading statistics to reduce
large attenuation of the desired signal in the presence of multipath channels is referred
to as diversity combining [Jak74, Bre59]. There are many ways by which several independent
fading copies of a signal may be provided to a receiver for diversity combining. Some
of these are described below.

Frequency diversity: A signal may be transmitted using several carriers such that the
separation between successive carrier frequencies is greater than the coherence
bandwidth of the channel to ensure that the fading associated with different
frequencies is uncorrelated.

Time diversity: In this method, several copies of the signal are transmitted using 
different time slots such that the separation between successive time slots is more 
than the coherence time of the channel. 

Space diversity: This method uses multiple antennas and is the subject of this chapter.
The method requires that the separation between multiple antennas be
sufficient for the various signals to be uncorrelated. The multiple antennas
may be used at a transmitter, at a receiver, or at both places depending on the
application. This chapter considers various diversity combining schemes using
a single transmitting antenna and multiple receiving antennas.

Predetection and postdetection schemes: A diversity-combining method may be classified
as a predetection or postdetection method. Predetection diversity-combining
methods combine the received signals prior to detection and use a single detector
to receive the information. Postdetection diversity methods, on the other hand,
employ separate detectors on each branch and then combine the signals from
different branches.

In this chapter, various diversity schemes are described, and their performance is analyzed 
and compared with that of a system using single receiving antennas. 

There are basically two performance parameters, namely the outage probability and the
average bit error rate (BER), to denote the performance of a diversity combiner. The outage
probability is the probability that the SNR $\gamma$ is below some threshold value $\gamma_0$. It is given
by (7.3.4). The average BER is determined by averaging the conditional BER $P_e(\gamma)$ for a
given SNR over all values of $\gamma$. An expression for the average BER is given by (7.3.5). The
conditional BER $P_e(\gamma)$ is a modulation-dependent quantity and is available in most textbooks
on digital communications such as [Cou95, Pro95, Skl01]. Expressions for conditional
BER for coherent binary phase shift keying (BPSK), differentially coherent binary
phase shift keying (DPSK), coherent orthogonal frequency shift keying (CFSK), and noncoherent
orthogonal frequency shift keying (NCFSK) are given by (7.3.55), (7.3.56), (7.3.57),
and (7.3.58), respectively.

The outage probability $P^o$ is a predetection parameter, and thus requires the pdf of $\gamma$ at
the input to the receiver. The average BER $P_e$ for predetection diversity schemes may be
determined using the pdf of $\gamma$ at the input to the receiver. For postdetection diversity
schemes it may be determined using the pdf of $\gamma$ at the output of the receiver.

In view of the above discussion, it is clear that we need to determine the pdf of $\gamma$ at the
input to the receiver to determine $P^o$ and $P_e$ for the predetection diversity schemes and
the pdf of $\gamma$ at the output of the receiver to determine $P_e$ for the postdetection diversity
schemes.

Consider a diversity-combining system consisting of L antennas as shown in Figure 8.1. 
It is assumed that the signal is transmitted using a single antenna. Thus, the system consists 
of L diversity channels carrying the same information. It is assumed that these channels 
are slow fading and frequency nonselective. Furthermore, the fading processes among 
these channels are mutually statistically independent. 



FIGURE 8.1

Block diagram of a diversity combining system.

The signal received on each antenna is weighted and summed to produce the output
y(t) of a diversity combiner. Let $x_i(t)$ denote the signal induced on the ith antenna due to
a desired signal source, K cochannel interferences, and uncorrelated noise. In the
complex form and omitting the carrier terms for ease of notation,

$$x_i(t) = r_i(t)\,g(t)\,e^{j\theta_i(t)} + \sum_{j=1}^{K} q_{ij}(t)\,g_j(t)\,e^{j\psi_{ij}(t)} + n_i(t) \qquad (8.1)$$

where $r_i(t)$ and $\theta_i(t)$, respectively, denote the amplitude and phase of the desired signal
received on the ith branch; $q_{ij}(t)$ and $\psi_{ij}(t)$, respectively, denote the amplitude and phase
of the jth interference received on the ith branch; g(t) denotes the desired message; $g_j(t)$
denotes the jth interference message; and $n_i(t)$ denotes the zero-mean, Gaussian noise of
variance (noise power) N present on the ith channel.

It is assumed that for any time t, $r_i$, i = 1, 2, ..., L are i.i.d. random variables (RVs) with
a specified distribution; $q_{ij}$, i = 1, 2, ..., L and j = 1, 2, ..., K are i.i.d. RVs with a specified
distribution; $\theta_i$, i = 1, 2, ..., L are i.i.d. RVs uniformly distributed in $[0, 2\pi)$; $\psi_{ij}$, i = 1, 2, ...,
L and j = 1, 2, ..., K are i.i.d. RVs uniformly distributed in $[0, 2\pi)$; and $r_i$, $n_i$, $q_{ij}$, $\theta_i$, and $\psi_{ij}$,
i = 1, 2, ..., L and j = 1, 2, ..., K are mutually independent.

Fading on various channels is assumed to be independent and represents the effect of
small-scale fading unless stated otherwise. In other words, the results presented are for a
given large-scale fading. It is assumed that all channels have the same mean signal power
$\bar{S}$ and the same mean interference power $\bar{I}_j$ due to the jth interference. For identical
interferences, it is denoted by $\bar{I}$, that is,

$$\bar{I} = \bar{I}_j, \quad j = 1, 2, \ldots, K \qquad (8.2)$$

The instantaneous signal power and interference power on the ith branch are denoted by
$S_i$ and $I_i$, respectively. These are related to $r_i$ and $q_{ij}$ as follows:

$$S_i = \frac{r_i^2}{2} \qquad (8.3)$$

and

$$I_i = \sum_{j=1}^{K} \frac{q_{ij}^2}{2} \qquad (8.4)$$

The received signal from each channel is multiplied by a complex weight before combining.
Let $w_i$ denote the weight of the ith channel. It then follows from Figure 8.1 that the
combiner output y(t) is given by

$$y(t) = \sum_{i=1}^{L} w_i^* x_i(t) \qquad (8.5)$$

where * denotes the conjugate of the complex quantity.

Define an L-dimensional complex vector $\mathbf{c}_s$, referred to as the signal channel gain vector, 
to denote the instantaneous channel gains for the desired signals received on L branches, 
that is, 

$$(\mathbf{c}_s)_i = \alpha_i e^{j\theta_i}, \quad i = 1, 2, \ldots, L \tag{8.6}$$

where $\alpha_i$ and $\theta_i$ are the magnitude and the phase of the channel gain of the ith branch, as 
discussed in Section 7.2. 

Similarly, define interference channel gain vectors $\mathbf{c}_{Ij}$, j = 1, 2, ..., K to denote the KL channel 
gains. It should be noted that, due to the independence assumptions for the signal and the various 
interference envelopes, $\mathbf{c}_s$ and $\mathbf{c}_{Ij}$, j = 1, 2, ..., K are mutually independent. 

Let an L-dimensional vector be defined as 

$$\mathbf{x}(t) = [x_1(t), x_2(t), \ldots, x_L(t)]^T \tag{8.7}$$

It follows from (8.1) and the discussion in Section 7.2 that x(t) can be written as 

$$\mathbf{x}(t) = \sqrt{p_s}\,g(t)\,\mathbf{c}_s + \sum_{j=1}^{K} \sqrt{p_{Ij}}\,g_j(t)\,\mathbf{c}_{Ij} + \mathbf{n}(t) \tag{8.8}$$

where the L-dimensional vector n(t) denotes the noise on L channels, $p_s$ denotes the mean 
signal power, and $p_{Ij}$ denotes the mean power of the jth interference. 

Let R denote the array correlation matrix. It follows from (8.8) that for given $\mathbf{c}_s$ and $\mathbf{c}_{Ij}$, 
j = 1, 2, ..., K, it is given by 

$$R = p_s\,\mathbf{c}_s\mathbf{c}_s^H + \sum_{j=1}^{K} p_{Ij}\,\mathbf{c}_{Ij}\mathbf{c}_{Ij}^H + N\mathbf{I} \tag{8.9}$$

Defining 

$$\mathbf{w} = [w_1, w_2, \ldots, w_L]^T \tag{8.10}$$

it follows from (8.5) that the output of an L-branch combiner can be written in vector 
notation as 

$$y(t) = \mathbf{w}^H\mathbf{x}(t) \tag{8.11}$$

The way that the weights on various branches are selected determines the type of 
diversity combiner being employed. Several of these diversity schemes are now considered. 
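
The weighted sum in (8.5) and (8.11) can be illustrated with a short sketch. The function name and the sample values below are illustrative assumptions and do not come from the text; the code only evaluates the conjugate-weighted sum for one time instant. 

```python
def combiner_output(weights, samples):
    """Return y = w^H x = sum_i conj(w_i) * x_i for one time instant, as in (8.5)."""
    if len(weights) != len(samples):
        raise ValueError("weights and samples must both have length L")
    # Each branch sample is multiplied by the conjugate of its weight, then summed.
    return sum(w.conjugate() * x for w, x in zip(weights, samples))


# Two-branch example: the first weight removes a 90-degree phase on branch 1.
y = combiner_output([1j, 1], [1j, 2])
```

Choosing the weights differently (one nonzero weight, equal-magnitude co-phasing weights, and so on) changes only the list passed as `weights`. 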


8.1 Selection Combiner 

In this case, one of the L diversity signals is selected for further processing. Thus, 

$$w_l = \begin{cases} 1 & \text{if } l = l_0, \\ 0 & \text{otherwise} \end{cases} \tag{8.1.1}$$

where $l_0$ denotes the selected branch. Theoretically, one would like to select the branch 
with the highest signal-to-noise ratio or, in an interference-limited system, the branch with the 
highest signal-to-cochannel-interference ratio. In practice, however, it is easier to implement a 
scheme that selects the branch with the largest power. 

Now we analyze the performance of a system using an L-branch selection combining 
scheme. Both noise-limited and interference-limited systems are considered [Abu92, 
Abu94, Abu94b, Cha79, Sim99]. 


8.1.1 Noise-Limited Systems 

The analysis of noise-limited systems consists of deriving expressions for the outage 
probability, the mean signal-to-noise ratio (SNR) and the average BER. 

8.1.1.1 Rayleigh Fading Environment 

First, consider that the system operates in the Rayleigh fading environment. 

8.1.1.1.1 Outage Probability 

Denoting the instantaneous SNR at the lth branch by $\gamma_l$, it follows from (7.3.13) that 

$$P(\gamma_l \le \gamma_0) = 1 - e^{-\gamma_0/\Gamma} \tag{8.1.2}$$

where $\Gamma$ denotes the mean SNR at each branch. 

Let $P^o_{SC}$ denote the outage probability of the selection combiner (SC). Then $P^o_{SC}$ is the 
probability that the instantaneous SNR in all L branches is simultaneously less than or 
equal to $\gamma_0$. Assuming that the fading on each branch is independent, it follows from (8.1.2) 
that 

$$\begin{aligned} P^o_{SC} &= P(\gamma_1, \ldots, \gamma_L \le \gamma_0) \\ &= P(\gamma_1 \le \gamma_0)\,P(\gamma_2 \le \gamma_0)\cdots P(\gamma_L \le \gamma_0) \\ &= \left(1 - e^{-\gamma_0/\Gamma}\right)^L \end{aligned} \tag{8.1.3}$$
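
As a numerical illustration of the outage expression for L independent Rayleigh branches, the following minimal sketch evaluates $(1 - e^{-\gamma_0/\Gamma})^L$. The function and argument names are assumptions made for this example; SNR values are taken as linear ratios, not dB. 

```python
import math


def sc_outage_rayleigh(gamma0, mean_snr, num_branches):
    """Outage probability of an L-branch selection combiner in Rayleigh fading."""
    if gamma0 < 0 or mean_snr <= 0 or num_branches < 1:
        raise ValueError("need gamma0 >= 0, mean_snr > 0 and at least one branch")
    single_branch = 1.0 - math.exp(-gamma0 / mean_snr)  # per-branch outage
    return single_branch ** num_branches                # all L branches in outage
```

For a threshold 10 dB below the mean branch SNR (`gamma0 / mean_snr = 0.1`), each added branch multiplies the outage probability by roughly 0.095. 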


8.1.1.1.2 Mean SNR 

Let $\Gamma_{SC}$ denote the mean SNR of an L-branch selection combiner. An expression for the 
mean SNR $\Gamma_{SC}$ may be obtained as follows. 

The mean SNR is given by 

$$\Gamma_{SC} = \int_0^\infty \gamma\,f_{\gamma_{SC}}(\gamma)\,d\gamma \tag{8.1.4}$$

where $f_{\gamma_{SC}}$ denotes the pdf of the instantaneous SNR of the received signal using the 
L-branch SC. It is related to the cdf of $\gamma$ by 

$$f_{\gamma_{SC}}(\gamma) = \frac{dF_{\gamma_{SC}}(\gamma)}{d\gamma} \tag{8.1.5}$$

Noting from (7.3.4) that $P^o = F_\gamma(\gamma_0)$, it follows from (8.1.3) that 

$$F_{\gamma_{SC}}(\gamma) = \left(1 - e^{-\gamma/\Gamma}\right)^L \tag{8.1.6}$$

From (8.1.5) and (8.1.6), 

$$f_{\gamma_{SC}}(\gamma) = \frac{L}{\Gamma}\left(1 - e^{-\gamma/\Gamma}\right)^{L-1} e^{-\gamma/\Gamma} = \frac{L}{\Gamma}\sum_{k=0}^{L-1}(-1)^k\binom{L-1}{k}\,e^{-\gamma(k+1)/\Gamma} \tag{8.1.7}$$

The second step follows using the binomial expansion, that is, 

$$(1-x)^n = \sum_{k=0}^{n}(-1)^k\binom{n}{k}x^k \tag{8.1.8}$$

Substituting from (8.1.7) in (8.1.4) and evaluating the integral [Jak74], 

$$\Gamma_{SC} = \Gamma\sum_{k=1}^{L}\frac{1}{k} \tag{8.1.9}$$

It follows from (8.1.9) that the mean SNR of the processor is improved by using 
an L-branch SC. The improvement factor is given by $\sum_{k=1}^{L} 1/k$. 
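
The improvement factor is the Lth harmonic number, so it grows only slowly with L. The sketch below tabulates it exactly and in dB; the function names are assumptions introduced for this illustration. 

```python
import math
from fractions import Fraction


def sc_mean_snr_gain(num_branches):
    """Improvement factor sum_{k=1}^{L} 1/k of the mean SNR, as an exact fraction."""
    if num_branches < 1:
        raise ValueError("num_branches must be at least 1")
    return sum(Fraction(1, k) for k in range(1, num_branches + 1))


def sc_mean_snr_gain_db(num_branches):
    """The same improvement factor expressed in decibels."""
    return 10.0 * math.log10(float(sc_mean_snr_gain(num_branches)))
```

Going from one to two branches gives a factor of 3/2, while going from three to four adds only 1/4 to the factor, which shows the diminishing return of additional branches. 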


8.1.1.1.3 Average BER 

The average BER at the output of the SC can be obtained by averaging the conditional BER 
for a given $\gamma$ over all values of $\gamma$. Thus, 

$$P_e = \int_0^\infty P_e(\gamma)\,f_{\gamma_{SC}}(\gamma)\,d\gamma \tag{8.1.10}$$

where $P_e(\gamma)$ denotes the conditional BER at the output of the SC for a given value of $\gamma$ for a 
particular modulation scheme, and $f_{\gamma_{SC}}(\gamma)$ denotes the pdf of $\gamma$. 

Consider an example of a coherent BPSK system. For this case, $P_e(\gamma)$ is given by (7.3.55), 
and in a Rayleigh fading environment with independent fading, $f_{\gamma_{SC}}(\gamma)$ is given by (8.1.7). 
Substituting these values and evaluating the integral using the identities 

$$\begin{aligned} \int_0^\infty \operatorname{erfc}(\sqrt{x})\,e^{-\alpha x}\,dx &= \frac{1}{\alpha}\left[1 - \frac{1}{\sqrt{1+\alpha}}\right] \\ \int_0^\infty \operatorname{erfc}(\sqrt{x})\,x\,e^{-\alpha x}\,dx &= \frac{1}{\alpha^2}\left[1 - \frac{1}{\sqrt{1+\alpha}} - \frac{\alpha}{2(1+\alpha)\sqrt{1+\alpha}}\right] \end{aligned} \tag{8.1.11}$$

one obtains [Eng96] 

$$P_{e,BPSK} = \frac{L}{2}\sum_{k=0}^{L-1}(-1)^k\binom{L-1}{k}\frac{1}{1+k}\left[1 - \frac{1}{\sqrt{1 + \dfrac{1+k}{\Gamma}}}\right] \tag{8.1.12}$$

Similarly, the result for DPSK may be obtained and is given by [Eng96] 

$$P_{e,DPSK} = \frac{L}{2}\sum_{k=0}^{L-1}(-1)^k\binom{L-1}{k}\frac{1}{\Gamma+1+k} \tag{8.1.13}$$

Note that the results for CFSK and NCFSK may be obtained by replacing $\Gamma$ by $\Gamma/2$ in 
(8.1.12) and (8.1.13), respectively. 
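
The two average BER sums can be evaluated directly. The sketch below assumes the usual conditional error rates, $\frac{1}{2}\operatorname{erfc}(\sqrt{\gamma})$ for coherent BPSK and $\frac{1}{2}e^{-\gamma}$ for DPSK, and integrates them term by term against the pdf (8.1.7); the function names are illustrative assumptions. 

```python
import math


def sc_ber_bpsk_rayleigh(mean_snr, num_branches):
    """Average BER of coherent BPSK with L-branch selection combining (Rayleigh)."""
    total = 0.0
    for k in range(num_branches):
        alpha = (1 + k) / mean_snr  # exponent rate of the kth term of the pdf
        term = (1.0 - 1.0 / math.sqrt(1.0 + alpha)) / (1 + k)
        total += (-1) ** k * math.comb(num_branches - 1, k) * term
    return 0.5 * num_branches * total


def sc_ber_dpsk_rayleigh(mean_snr, num_branches):
    """Average BER of DPSK with L-branch selection combining (Rayleigh)."""
    total = sum(
        (-1) ** k * math.comb(num_branches - 1, k) / (mean_snr + 1 + k)
        for k in range(num_branches)
    )
    return 0.5 * num_branches * total
```

With a single branch the functions reduce to the familiar no-diversity Rayleigh results, which is a convenient check. The alternating sums lose precision for large L and high SNR, so this direct evaluation is best kept to moderate L. 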


8.1.1.2 Nakagami Fading Environment 

First, consider the pdf of the SNR at the output of the selection combiner. 

8.1.1.2.1 Output SNR pdf 

An expression for the pdf of $\gamma$ at a single branch is given by (7.3.14), which is therefore 
also the pdf of $\gamma$ at a branch of the SC. Rewriting (7.3.14): 

$$f_\gamma(\gamma) = \left(\frac{m}{\Gamma}\right)^m\frac{\gamma^{m-1}}{\Gamma(m)}\,e^{-m\gamma/\Gamma} \tag{8.1.14}$$

It follows from (7.3.17) that the corresponding expression for $F_\gamma(\gamma)$ is given by 

$$F_\gamma(\gamma) = 1 - e^{-m\gamma/\Gamma}\sum_{k=0}^{m-1}\frac{(m\gamma/\Gamma)^k}{k!} \tag{8.1.15}$$


Assuming that the fading on each branch is independent, the cdf at the output of the 
SC is the product of the individual cdfs, as in (8.1.6) for the Rayleigh fading case. Thus, 

$$F_{\gamma_{SC}}(\gamma) = \left(F_\gamma(\gamma)\right)^L \tag{8.1.16}$$

The pdf of $\gamma_{SC}$ then is 

$$f_{\gamma_{SC}}(\gamma) = \frac{dF_{\gamma_{SC}}(\gamma)}{d\gamma} \tag{8.1.17}$$

Substituting from (8.1.16), it follows that 

$$f_{\gamma_{SC}}(\gamma) = \frac{d\left(F_\gamma(\gamma)\right)^L}{d\gamma} = L\left(F_\gamma(\gamma)\right)^{L-1}f_\gamma(\gamma) \tag{8.1.18}$$

which, along with (8.1.14) and (8.1.15), implies that 

$$f_{\gamma_{SC}}(\gamma) = L\left(\frac{m}{\Gamma}\right)^m\frac{\gamma^{m-1}}{\Gamma(m)}\,e^{-m\gamma/\Gamma}\left[1 - e^{-m\gamma/\Gamma}\sum_{k=0}^{m-1}\frac{(m\gamma/\Gamma)^k}{k!}\right]^{L-1} \tag{8.1.19}$$


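Using the binomial expansion (8.1.8), (8.1.19) can be expressed as [Abu94b] 

$$\begin{aligned} f_{\gamma_{SC}}(\gamma) &= L\left(\frac{m}{\Gamma}\right)^m\frac{\gamma^{m-1}}{\Gamma(m)}\,e^{-m\gamma/\Gamma}\sum_{i=0}^{L-1}(-1)^i\binom{L-1}{i}\left[e^{-m\gamma/\Gamma}\sum_{k=0}^{m-1}\frac{(m\gamma/\Gamma)^k}{k!}\right]^i \\ &= L\left(\frac{m}{\Gamma}\right)^m\frac{1}{\Gamma(m)}\sum_{i=0}^{L-1}(-1)^i\binom{L-1}{i}\sum_{j\in B}\frac{d_{ji}}{A_{ji}}\left(\frac{m}{\Gamma}\right)^{C_{ji}}\gamma^{C_{ji}+m-1}\,e^{-m(i+1)\gamma/\Gamma} \end{aligned} \tag{8.1.20}$$

where B is the set of all possible nonnegative integer combinations $(n_0, n_1, \ldots, n_{m-1})$ such that 

$$\sum_{k=0}^{m-1} n_k = i \tag{8.1.21}$$

and 

$$C_{ji} = n_1 + 2n_2 + \cdots + (m-1)\,n_{m-1} \tag{8.1.22}$$

$$A_{ji} = (2!)^{n_2}(3!)^{n_3}\cdots((m-1)!)^{n_{m-1}} \tag{8.1.23}$$

$$d_{ji} = \frac{i!}{n_0!\,n_1!\cdots n_{m-1}!} \tag{8.1.24}$$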
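
The set B and the constants of (8.1.22) to (8.1.24) are those of a multinomial expansion of $\left(\sum_{k=0}^{m-1} x^k/k!\right)^i$. The following sketch enumerates B by brute force and checks the expansion numerically. The function names are assumptions for this illustration, and the enumeration is only practical for small i and m. 

```python
import math
from itertools import product


def multinomial_terms(i, m):
    """Yield (C, A, d) for every tuple (n_0, ..., n_{m-1}) of nonnegative ints summing to i."""
    for n in product(range(i + 1), repeat=m):
        if sum(n) != i:
            continue
        c = sum(k * n[k] for k in range(m))                            # as in (8.1.22)
        a = math.prod(math.factorial(k) ** n[k] for k in range(m))     # as in (8.1.23); 0! = 1! = 1
        d = math.factorial(i) // math.prod(math.factorial(nk) for nk in n)  # as in (8.1.24)
        yield c, a, d


def expanded_power(x, i, m):
    """Sum over B of d * x**C / A."""
    return sum(d * x ** c / a for c, a, d in multinomial_terms(i, m))


def direct_power(x, i, m):
    """(sum_{k=0}^{m-1} x**k / k!)**i evaluated directly."""
    return sum(x ** k / math.factorial(k) for k in range(m)) ** i
```

Agreement between `expanded_power` and `direct_power` for arbitrary x confirms that the three constants are consistent with one another. 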


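Now, consider a dual-diversity system (L = 2) operating in a Nakagami fading environment 
with fading parameter m [Sim99]. For this case, an expression for the pdf of the SNR 
at the output of the SC is given by 

$$\begin{aligned} f_{\gamma_{SC}}(\gamma) ={}& \frac{m}{\Gamma(m)\,\Gamma_1}\left(\frac{m\gamma}{\Gamma_1}\right)^{m-1}\exp\left(-\frac{m\gamma}{\Gamma_1}\right)\left[1 - \frac{\Gamma\!\left(m, \frac{m\gamma}{\Gamma_2}\right)}{\Gamma(m)}\right] \\ &+ \frac{m}{\Gamma(m)\,\Gamma_2}\left(\frac{m\gamma}{\Gamma_2}\right)^{m-1}\exp\left(-\frac{m\gamma}{\Gamma_2}\right)\left[1 - \frac{\Gamma\!\left(m, \frac{m\gamma}{\Gamma_1}\right)}{\Gamma(m)}\right] \end{aligned} \tag{8.1.25}$$

where $\Gamma_1$ and $\Gamma_2$ denote the mean SNR at Branch 1 and Branch 2, respectively, and $\Gamma(m,x)$ 
is the incomplete gamma function. 

For integer values of the fading parameter m, $\Gamma(m,x)$ has a closed-form solution, and 
thus (8.1.25) becomes 

$$\begin{aligned} f_{\gamma_{SC}}(\gamma) ={}& \frac{m}{(m-1)!\,\Gamma_1}\left(\frac{m\gamma}{\Gamma_1}\right)^{m-1}\exp\left(-\frac{m\gamma}{\Gamma_1}\right)\left[1 - H(\gamma,\Gamma_2,m)\right] \\ &+ \frac{m}{(m-1)!\,\Gamma_2}\left(\frac{m\gamma}{\Gamma_2}\right)^{m-1}\exp\left(-\frac{m\gamma}{\Gamma_2}\right)\left[1 - H(\gamma,\Gamma_1,m)\right] \end{aligned} \tag{8.1.26}$$

where 

$$H(\gamma,\Gamma_i,m) = \exp\left(-\frac{m\gamma}{\Gamma_i}\right)\sum_{k=0}^{m-1}\frac{(m\gamma/\Gamma_i)^k}{k!}, \quad i = 1, 2 \tag{8.1.27}$$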


8.1.1.2.2 Outage Probability 

It follows from (8.1.15) and (8.1.16) that the outage probability in the presence of Nakagami 
fading is given by 

$$P^o_{SC} = F_{\gamma_{SC}}(\gamma_0) = \left[1 - e^{-m\gamma_0/\Gamma}\sum_{k=0}^{m-1}\frac{(m\gamma_0/\Gamma)^k}{k!}\right]^L \tag{8.1.28}$$
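
A minimal sketch of the Nakagami outage expression for integer m follows. The names are illustrative assumptions; for m = 1 the result should coincide with the Rayleigh expression (8.1.3). 

```python
import math


def nakagami_branch_cdf(gamma, mean_snr, m):
    """Single-branch SNR cdf for integer Nakagami parameter m."""
    if m < 1 or int(m) != m:
        raise ValueError("this closed form assumes a positive integer m")
    x = m * gamma / mean_snr
    return 1.0 - math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(int(m)))


def sc_outage_nakagami(gamma0, mean_snr, m, num_branches):
    """Outage probability of an L-branch selection combiner in Nakagami fading."""
    return nakagami_branch_cdf(gamma0, mean_snr, m) ** num_branches
```

For a fixed threshold below the mean SNR, increasing m (milder fading) lowers the per-branch outage, and the exponent L then compounds the reduction. 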




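For a dual-diversity system, an expression can be derived for $F_{\gamma_{SC}}(\gamma)$, the cdf of the SNR 
at the output of the SC in the Nakagami fading environment for integer values of m, by 
integrating (8.1.26). It is given by [Sim99] 

$$F_{\gamma_{SC}}(\gamma) = 1 - H(\gamma,\Gamma_1,m) + 1 - H(\gamma,\Gamma_2,m) - \sum_{n=0}^{m-1}\frac{(n+m-1)!}{n!\,(m-1)!}\,\frac{\Gamma_1^n\Gamma_2^m + \Gamma_1^m\Gamma_2^n}{(\Gamma_1+\Gamma_2)^{n+m}}\left[1 - H_n\!\left(\gamma, \frac{\Gamma_1\Gamma_2}{\Gamma_1+\Gamma_2}, m\right)\right] \tag{8.1.29}$$

where 

$$H_n(\gamma,\Gamma,m) = \exp\left(-\frac{m\gamma}{\Gamma}\right)\sum_{k=0}^{n+m-1}\frac{(m\gamma/\Gamma)^k}{k!} \tag{8.1.30}$$

and $H_0(\gamma,\Gamma,m)$ is equal to $H(\gamma,\Gamma,m)$ given by (8.1.27). The outage probability $P^o_{SC}$ is then 
given by 

$$P^o_{SC} = F_{\gamma_{SC}}(\gamma_0) \tag{8.1.31}$$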
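
Because the two branches fade independently, the dual-branch cdf must also equal the product of the two single-branch cdfs. The sketch below integrates the dual-branch pdf term by term for integer m and checks the result against that product. It is an illustration under the stated independence assumption, with hypothetical function names, not a transcription of [Sim99]. 

```python
import math


def h_n(gamma, mean_snr, m, n=0):
    """exp(-m*gamma/G) * sum_{k=0}^{n+m-1} (m*gamma/G)^k / k!  (n = 0 gives H)."""
    x = m * gamma / mean_snr
    return math.exp(-x) * sum(x ** k / math.factorial(k) for k in range(n + m))


def dual_sc_cdf(gamma, g1, g2, m):
    """Cdf of the output SNR of a dual-branch SC with unequal mean SNRs g1 and g2."""
    g_par = g1 * g2 / (g1 + g2)  # combined mean SNR appearing in the cross terms
    cross = sum(
        math.comb(n + m - 1, n)
        * (g1 ** n * g2 ** m + g1 ** m * g2 ** n) / (g1 + g2) ** (n + m)
        * (1.0 - h_n(gamma, g_par, m, n))
        for n in range(m)
    )
    return (1.0 - h_n(gamma, g1, m)) + (1.0 - h_n(gamma, g2, m)) - cross


def product_of_cdfs(gamma, g1, g2, m):
    """Reference value: product of the two single-branch cdfs."""
    return (1.0 - h_n(gamma, g1, m)) * (1.0 - h_n(gamma, g2, m))
```

The outage probability is then `dual_sc_cdf(gamma0, g1, g2, m)` evaluated at the threshold SNR. 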
8.1.1.2.3 Average BER 

Using the conditional BER given by (7.3.58) for an NCFSK system and the pdf given by 
(8.1.20) in a Nakagami fading environment (integer values of m), the average BER can be 
obtained using (7.3.5). An expression is given by [Abu94b] 

$$P_e^{SC} = \frac{L}{2\,\Gamma(m)}\sum_{i=0}^{L-1}(-1)^i\binom{L-1}{i}\sum_{j\in B}\frac{d_{ji}\,\Gamma(C_{ji}+m)}{A_{ji}}\left(\frac{m}{m(i+1)+0.5\,\Gamma}\right)^{C_{ji}+m} \tag{8.1.32}$$

The result for a DPSK system is given by replacing $\Gamma$ with $2\Gamma$ in (8.1.32). For a dual-diversity 
system, the average BER can also be obtained using the pdf given by (8.1.26). 
An expression for the average BER for a DPSK system is given by [Sim99] 

$$P_e^{SC} = \frac{1}{2}\left\{\sum_{l=1}^{2}\left(\frac{m}{m+\Gamma_l}\right)^m - \sum_{n=0}^{m-1}\frac{(n+m-1)!}{n!\,(m-1)!}\,\frac{\Gamma_1^n\Gamma_2^m + \Gamma_1^m\Gamma_2^n}{(\Gamma_1+\Gamma_2)^{n+m}}\left(\frac{m}{m + \dfrac{\Gamma_1\Gamma_2}{\Gamma_1+\Gamma_2}}\right)^{n+m}\right\} \tag{8.1.33}$$

The result for NCFSK can be obtained from (8.1.33) by replacing $\Gamma_i$ with $\Gamma_i/2$, i = 1, 2. 


8.1.2 Interference-Limited Systems 

Assume that a desired signal and K co-channel interferences are present in a Nakagami 
fading environment. Assume that all interferences have the same statistics with the fading 
parameter denoted by m, the same as the desired signal, and have equal mean power 
denoted by I. 

Now, selection combiner performance using three possible selection algorithms 
[Abu92] is presented: the desired signal power algorithm, the total power algorithm, and the 
signal-to-interference ratio (SIR) power algorithm. 


8.1.2.1 Desired Signal Power Algorithm 

In this algorithm, the selection combiner selects the branch with the largest desired signal 
power. Let $S_i$ denote the signal power of the ith branch, that is, 

$$S_i = \frac{r_i^2}{2} \tag{8.1.34}$$

When all branches experience independent fading, it follows from (7.1.54) that the pdf of 
$S_i$ is given by 

$$f_{S_i}(s) = \left(\frac{m}{S}\right)^m\frac{s^{m-1}}{\Gamma(m)}\exp\left(-\frac{ms}{S}\right) \tag{8.1.35}$$

where S denotes the mean signal power of each branch. 

Let $I_i$ denote the total interference power received on the ith branch, that is, 

$$I_i = \sum_{j=1}^{K}\frac{q_{ij}^2}{2} \tag{8.1.36}$$

The pdf of $I_i$, as seen in (7.3.22), is given by 

$$f_{I_i}(x) = \left(\frac{m}{I}\right)^{mK}\frac{x^{mK-1}}{\Gamma(mK)}\exp\left(-\frac{mx}{I}\right) \tag{8.1.37}$$

Let $\mu_{SC}$ denote the SIR at the output of the selection combiner. As the SC selects 
only one branch for processing, $\mu_{SC}$ also denotes the SIR at the selected branch. The 
probability that $\mu_{SC}$ is less than or equal to the threshold $\mu_0$ is given by 

$$P^o_{SC} = \int_0^{\mu_0} f_{\mu_{SC}}(\mu)\,d\mu \tag{8.1.38}$$


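where $f_{\mu_{SC}}(\mu)$ denotes the pdf of the SIR at the output of the SC, and is given by [Abu92] 

$$f_{\mu_{SC}}(\mu) = \frac{L A}{\Gamma(m)\,\Gamma(mK)}\sum_{i=0}^{L-1}(-1)^i\binom{L-1}{i}\sum_{j\in B}\frac{d_{ji}}{A_{ji}}\,\Gamma(m+mK+C_{ji})\left(\frac{m}{S}\right)^{C_{ji}}\frac{\mu^{m+C_{ji}-1}}{\left[\dfrac{m(i+1)\mu}{S}+\dfrac{m}{I}\right]^{m+mK+C_{ji}}} \tag{8.1.39}$$

where 

$$A = \left(\frac{m}{S}\right)^m\left(\frac{m}{I}\right)^{mK} \tag{8.1.40}$$

B, $C_{ji}$, $A_{ji}$, and $d_{ji}$ are given by (8.1.21) to (8.1.24). 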

Substituting for $f_{\mu_{SC}}(\mu)$ in (8.1.38) and carrying out the integral [Abu92], 


T l- 1 ij n w 

P ° = L yy y ] 


t=o 


A. 


V 1 ji 


rlm + mK + CjJ 


( m + C H -C 

1 (-1)‘ 

(A) 

mk 

(n)‘ i 

l * j 

1 (i + l) m+mK+ C i j 

v^o J 


t + mK |(q) mK+t 


(8.1.41) 


where 

$$\eta = \frac{\mu_0(i+1)}{\bar{\mu}} \tag{8.1.42}$$

with $\bar{\mu}$ denoting the ratio of the average signal power S to the average interference power I at one 
branch. 

For the Rayleigh fading environment (m = 1), (8.1.41) becomes 


L-l 

p s o c=L^(-iy 

i=0 


fL-iy 

| 1 

fAl 

K , 

1 

l i J 

1 (i +1) 

vho ) 


(n) k 


(8.1.43) 


which, for a single interference (K = 1), reduces to 

$$P^o_{SC} = L\sum_{i=0}^{L-1}(-1)^i\binom{L-1}{i}\frac{1}{(i+1)+\dfrac{\bar{\mu}}{\mu_0}} \tag{8.1.44}$$

Note that for L = 1, (8.1.44) reduces to (7.3.35), the result for the single-antenna system. 
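
The single-interferer Rayleigh case can be checked numerically. In the sketch below, `mu0` is the SIR threshold and `mean_sir` is the ratio S/I at one branch; both names, and the function name, are assumptions for this illustration. The second function evaluates the same probability by averaging the cdf of the largest of L exponential signal powers over an exponentially distributed interference power, which gives an independent cross-check. 

```python
import math


def outage_desired_power(mu0, mean_sir, num_branches):
    """Outage for the desired-signal-power algorithm, Rayleigh fading, one interferer."""
    return num_branches * sum(
        (-1) ** i * math.comb(num_branches - 1, i) / ((i + 1) + mean_sir / mu0)
        for i in range(num_branches)
    )


def outage_desired_power_alt(mu0, mean_sir, num_branches):
    """Same probability from E_I[(1 - exp(-mu0 * I / S))^L] with exponential I."""
    return sum(
        (-1) ** l * math.comb(num_branches, l) / (1.0 + l * mu0 / mean_sir)
        for l in range(num_branches + 1)
    )
```

For one branch both functions return `mu0 / (mu0 + mean_sir)`, the single-antenna outage. 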


8.1.2.2 Total Power Algorithm 

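In this algorithm, the branch with the largest total power is selected. Thus, Branch 1 is 
selected if 

$$S_1 + I_1 > S_j + I_j, \quad j = 2, \ldots, L \tag{8.1.45}$$

For this case, the outage probability is given by 

$$P^o_{SC} = P[\mu_{SC} \le \mu_0] = L\,P\left[\frac{S_1}{I_1} \le \mu_0,\ S_1 + I_1 > S_j + I_j,\ j = 2, \ldots, L\right] \tag{8.1.46}$$

This expression is evaluated in [Abu92], resulting in 

$$P^o_{SC} = L\int_0^{\mu_0} du\int_0^\infty f_{uv}(u,v)\left(P[S_i + I_i < v]\right)^{L-1} dv \tag{8.1.47}$$

where $f_{uv}$ is the joint pdf of u and v, given by 

$$f_{uv}(u,v) = \frac{A}{\Gamma(m)\,\Gamma(mK)}\,\frac{u^{m-1}\,v^{m+mK-1}}{(1+u)^{m+mK}}\exp\left[-\frac{mv}{1+u}\left(\frac{u}{S}+\frac{1}{I}\right)\right] \tag{8.1.48}$$

with A given by (8.1.40), 

$$u = \frac{S_1}{I_1} \tag{8.1.49}$$

and 

$$v = S_1 + I_1 \tag{8.1.50}$$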
(8.1.51) 


f mv ')\-i ^ [ (-1)' " fj Y mv Y 

UxJ 


r(mK + j-n) 

(h)^ 1 

r(mK) 

(M mK+i “ n 


-1 mK+j-n—1 f j ^ 


H-T LL I 


j=0 n=0 z=0 


n 



f mv^ 

n+z r(mK +j-n) (^i) mK 

j! 

l s J 

r(mK)z! (^-l) mK+j - n - z 


For dual-branch diversity systems, (8.1.47) reduces to 

$$P^o_{SC} = 2\,(R_1 - R_2 - R_3 + R_4)$$


where 


r(m + mK) v - ’ f m - l'i (-l) m 1 1 


L in — 

i 


1 r(m)r(mK) - 


i - m - mK +1 


ho 


\ i+l-m-mK 


+ 1 


-l 


V n j 


R 2 = 


1 

r( m )r(mK) L 


r(m + mK + 1 ), 


Ho 


u m_1 (u + 1)‘ 


t! 

j 


lY 

n4 

“ 

m+mK+t 


0 

u 

+ 2 





. hj 




du 


R 3 = 


r(mK + j-n) 
r(m)r(mK)i(ij j'I-(mK) 




and 


v n / 


(.-I)n-^ r um - 1( uH-ir 

o [2U + H + 1 


-du 


-1 j mK+j-n-1 


R4 r(m)r(mK) 


rfmKilE E 


j=0 n=0 z=0 


r(j + mK-n) 
j!z!r(mi<) 


.(_iy- n ^ (Iip-z-.i-.jp™ f u m H u+l) 

W ' Ir L.i 


r(n + z + mK + m) 


m-l/ , 1 \n+z 


r ( y 


m+mK+n+z 

U 1 + - 

+ 2 


V hj 




du 


(8.1.52) 


(8.1.53) 


(8.1.54) 


(8.1.55) 


(8.1.56) 


© 2004 by CRC Press LLC 



8.1.2.3 SIR Power Algorithm 

In this algorithm, a diversity branch with the highest signal power to interference power 
ratio is selected and an expression for outage probability is given by [Abu 92] 


P S °c=PMbo] 

= { P [^sc - h 0 an d L = l]] L 


(8.1.57) 


+ 1 


v^o 


Li 

i=0 V 


m + i-lV u ^ 
1 + ^ 


For the Rayleigh fading environment (m = 1) and one interference (K = 1), this reduces to 

$$P^o_{SC} = \left(\frac{\mu_0}{\mu_0 + \bar{\mu}}\right)^L \tag{8.1.58}$$
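
For the SIR power algorithm in Rayleigh fading with one interferer, the outage probability is the single-branch outage raised to the power L. The sketch below evaluates it and, as a usage example, finds the smallest number of branches that meets a target outage. The function names and the search helper are assumptions made for this illustration. 

```python
def sir_selection_outage(mu0, mean_sir, num_branches):
    """Outage of the SIR power algorithm: (mu0 / (mu0 + mean_sir))**L."""
    if mu0 <= 0 or mean_sir <= 0 or num_branches < 1:
        raise ValueError("need positive mu0 and mean_sir, and at least one branch")
    return (mu0 / (mu0 + mean_sir)) ** num_branches


def branches_for_target(mu0, mean_sir, target):
    """Smallest L whose outage does not exceed the target probability."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    num_branches = 1
    while sir_selection_outage(mu0, mean_sir, num_branches) > target:
        num_branches += 1
    return num_branches
```

With a threshold 7 dB below the mean SIR (`mu0 = 2`, `mean_sir = 10`), the per-branch outage is 1/6, so three branches are needed to reach an outage of 1 percent. 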


8.2 Switched Diversity Combiner 

The switched diversity scheme, also known as scanning diversity, is similar to the selection 
diversity discussed in the previous section, except that in this scheme the signals received on 
the L branches are scanned in a fixed sequence until one is found above a given 
threshold, rather than the best one being used as in selection diversity. For example, 
when the total received power is considered, the received power of the selected branch 
is continuously compared with a given threshold value, $\xi_0$. As long as the received power 
remains above $\xi_0$, no switching is done. When it drops below $\xi_0$, the next branch is 
examined and switched to the receiver if the received power on this branch is found to 
be above $\xi_0$; otherwise, the search continues. 

In this section, expressions for the outage probability and average BER in the Nakagami 
fading environment are derived, and the effect of correlation on average BER is examined 
[Abu92, Abu94a, Sch72]. 

8.2.1 Outage Probability 

The outage probability for this scheme in the presence of Nakagami distributed interferences 
with statistics similar to those described in Section 8.1 can be written as [Abu92, Sch72] 

$$\begin{aligned} P^o_{SW} ={}& P[\mu_{SW} \le \mu_0] \\ ={}& P[\mu_{SW} \le \mu_0 \mid S_i + I_i < \xi_0, \text{ for all } i]\;P[S_i + I_i < \xi_0, \text{ for all } i] \\ &+ P[\mu_{SW} \le \mu_0 \mid S_i + I_i \ge \xi_0, \text{ for at least one } i]\;P[S_i + I_i \ge \xi_0, \text{ for at least one } i] \end{aligned} \tag{8.2.1}$$

where $\mu_{SW}$ denotes the SIR at the output of a switched diversity combiner (SDC), which 
is the same as the SIR at the selected branch. 

Now, consider the various terms on the right side of (8.2.1). When $S_i + I_i < \xi_0$ for all i, 

$$P[\mu_{SW} \le \mu_0] = 1 \tag{8.2.2}$$

as none of the branches is selected, effectively representing an outage. 

From the independence of the desired signal and interferences, it follows that 

$$P[S_i + I_i < \xi_0, \text{ for all } i] = \left\{P[S_1 + I_1 < \xi_0]\right\}^L \tag{8.2.3}$$

and 

$$P[S_i + I_i \ge \xi_0 \text{ for at least one } i] = 1 - P[S_i + I_i < \xi_0, \text{ for all } i] = 1 - \left\{P[S_1 + I_1 < \xi_0]\right\}^L \tag{8.2.4}$$

Assuming without any loss of generality that Branch 1 is selected, 
it follows from (8.2.1) to (8.2.4) that 

$$P^o_{SW} = \left\{P[S_1 + I_1 < \xi_0]\right\}^L + P[\mu_{SW} \le \mu_0 \mid S_1 + I_1 \ge \xi_0]\left(1 - \left\{P[S_1 + I_1 < \xi_0]\right\}^L\right) \tag{8.2.5}$$


A further manipulation of (8.2.5) leads to [Abu92] 


i- 


I r ,.L i ( p [S, + I,<5 0 ]) L ) 

c=( p hu,<y} 


where 


( 8 . 2 . 6 ) 


r(m + mK) 
1 “ r(m)r(mK) 


m +mK -1 / c- \ j 


(p) mK £ 


I 

2S J j! 


x f e \ 

-mq 0 


J 


\ i-m-mK 

u + p) u 

(l + uf exp ' 


.(l + u|2S ,ll+P) , 


du 


(8.2.7) 


and $P[S_1 + I_1 < \xi_0]$ is given by (8.1.51). 


8.2.2 Average Bit Error Rate 

In this section, average BER is examined by considering an example of noncoherent 
detection of binary FSK (NCFSK) signals in additive white Gaussian noise (AWGN) for a 
two-branch diversity system in a slow Nakagami fading environment [Abu94a]. 

Assume that the switching is done at discrete intervals of time t = nT, where n denotes 
an integer and T is the time interval between samples. Let $a_n$ and $b_n$ denote the samples 
of the signal envelopes at the two antennas at time t = nT, and let $X_n = \frac{1}{2}a_n^2$ and $Y_n = \frac{1}{2}b_n^2$ 
denote their respective local signal powers. In the Nakagami fading environment, $a_n$ and 
$b_n$ are Nakagami distributed RVs, and $X_n$ and $Y_n$ are gamma distributed. 

Assume the following switching scheme is employed. Let the antenna selected at t = 
(n − 1)T be Number 1. Switching to antenna Number 2 is done iff $X_n < \xi_0$. Next, let $S_n$ 
denote the local signal power at the output of the switched diversity system. It follows 
from the above switching strategy that 

$$S_n = X_n \quad \text{iff} \quad \begin{cases} S_{n-1} = X_{n-1} \text{ and } X_n \ge \xi_0, \text{ or} \\ S_{n-1} = Y_{n-1} \text{ and } Y_n < \xi_0 \end{cases} \tag{8.2.8}$$

$S_n = Y_n$ as above with X and Y interchanged. 
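
The switching strategy can be traced with a few lines of code. The sketch below is an illustrative reading of the rule (stay on the current antenna while its power is at or above the threshold, otherwise switch to the other antenna without examining it first); the function name, the 0/1 branch labels, and the sample values are assumptions. 

```python
def switched_output(x_samples, y_samples, threshold, start_branch=0):
    """Return a list of (branch, power) pairs, one per sampling instant.

    Branch 0 carries the X_n samples and branch 1 the Y_n samples.
    """
    branch = start_branch
    output = []
    for x, y in zip(x_samples, y_samples):
        current = x if branch == 0 else y
        if current < threshold:
            branch = 1 - branch  # power fell below the threshold: switch antennas
        output.append((branch, x if branch == 0 else y))
    return output


# Example trace with threshold 2: stay, switch, switch back, stay.
trace = switched_output([5, 1, 1, 5], [5, 5, 1, 1], threshold=2)
```

In the third sample of the example both antennas are below the threshold, so the combiner switches and still delivers a weak signal. This is the behavior that the factor multiplying the low-power part of the output pdf accounts for. 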

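When the fading at two antennas is independent, the pdf of $S_n$ is given by [Abu94a] 

$$f_{S_n}(u) = \begin{cases} B\,\dfrac{m}{S}\left(\dfrac{mu}{S}\right)^{m-1}\dfrac{e^{-mu/S}}{(m-1)!}, & u \le \xi_0 \\[2ex] (1+B)\,\dfrac{m}{S}\left(\dfrac{mu}{S}\right)^{m-1}\dfrac{e^{-mu/S}}{(m-1)!}, & u > \xi_0 \end{cases} \tag{8.2.9}$$

where 

$$B = 1 - e^{-m\xi_0/S}\sum_{i=0}^{m-1}\frac{(m\xi_0/S)^i}{i!} \tag{8.2.10}$$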


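Let $\gamma_n$ denote the instantaneous SNR at the output of the system at time t = nT, defined as 

$$\gamma_n = \frac{S_n}{N} \tag{8.2.11}$$

with N denoting the variance (noise power) of zero mean AWGN. 

Following a procedure similar to that used in Section 7.3.1, it can easily be shown that 
when the fading at two antennas is independent, the pdf of $\gamma_n$ obtained from the pdf of 
$S_n$ is given by 

$$f_{\gamma_n}(\gamma) = \begin{cases} B\,\dfrac{m}{\Gamma}\left(\dfrac{m\gamma}{\Gamma}\right)^{m-1}\dfrac{e^{-m\gamma/\Gamma}}{(m-1)!}, & \gamma \le \dfrac{\xi_0}{N} \\[2ex] (1+B)\,\dfrac{m}{\Gamma}\left(\dfrac{m\gamma}{\Gamma}\right)^{m-1}\dfrac{e^{-m\gamma/\Gamma}}{(m-1)!}, & \gamma > \dfrac{\xi_0}{N} \end{cases} \tag{8.2.12}$$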


Substituting for the pdf of the instantaneous SNR given by (8.2.12) and the conditional 
probability of error for the NCFSK system given by (7.3.58) in (7.3.5), and evaluating the 
integral, the average BER of the NCFSK in the Nakagami fading environment becomes 
[Abu94a] 

$$P_e^{SW} = \frac{1}{2\left(\dfrac{\Gamma}{2m}+1\right)^m}\left\{1 - e^{-m\xi_0/\Gamma}\sum_{j=0}^{m-1}\frac{(m\xi_0/\Gamma)^j}{j!} + e^{-\xi_0\left(\frac{1}{2}+\frac{m}{\Gamma}\right)}\sum_{j=0}^{m-1}\frac{\left[\xi_0\left(\frac{1}{2}+\frac{m}{\Gamma}\right)\right]^j}{j!}\right\} \tag{8.2.13}$$

It should be noted that $P_e^{SW}$ depends on the threshold value $\xi_0$ used for 
switching. The optimum value of the threshold, $\hat{\xi}_0$, may be obtained by solving 

$$\left.\frac{dP_e^{SW}}{d\xi_0}\right|_{\xi_0=\hat{\xi}_0} = 0 \tag{8.2.14}$$

Substituting from (8.2.13) in (8.2.14) and solving for $\hat{\xi}_0$ yields 

$$\hat{\xi}_0 = 2m\ln\left(\frac{\Gamma}{2m}+1\right) \tag{8.2.15}$$

For the Rayleigh fading environment, m = 1. Thus, substituting m = 1 in (8.2.13) and 
(8.2.15), expressions for $P_e^{SW}$ and $\hat{\xi}_0$ in Rayleigh fading channels become 

$$P_e^{SW} = \frac{1}{2+\Gamma}\left(1 - e^{-\xi_0/\Gamma} + e^{-\xi_0\left(\frac{1}{2}+\frac{1}{\Gamma}\right)}\right) \tag{8.2.16}$$

and 

$$\hat{\xi}_0 = 2\ln\left(\frac{\Gamma}{2}+1\right) \tag{8.2.17}$$
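
The dependence of the average BER on the switching threshold can be explored numerically. The sketch below assumes an integer m and a threshold expressed in the same normalized units as the mean SNR (that is, the threshold power divided by the noise power); these conventions and the function names are assumptions made for the illustration. 

```python
import math


def _gamma_tail(m, x):
    """exp(-x) * sum_{j=0}^{m-1} x^j / j!  for integer m."""
    return math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(m))


def sw_ber_ncfsk(mean_snr, m, threshold):
    """Average NCFSK BER of a two-branch switched combiner in Nakagami fading."""
    lead = 0.5 / (mean_snr / (2.0 * m) + 1.0) ** m  # single-branch NCFSK BER
    low = _gamma_tail(m, m * threshold / mean_snr)
    high = _gamma_tail(m, threshold * (0.5 + m / mean_snr))
    return lead * (1.0 - low + high)


def sw_optimum_threshold(mean_snr, m):
    """Threshold that minimizes the average BER."""
    return 2.0 * m * math.log(mean_snr / (2.0 * m) + 1.0)
```

A threshold of zero means the combiner never switches, so `sw_ber_ncfsk(mean_snr, m, 0.0)` equals the single-branch BER. The benefit of switching appears as the threshold approaches `sw_optimum_threshold(mean_snr, m)`. 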


8.2.3 Correlated Fading 

The expression for average BER given by (8.2.13) is derived for the case in which the fading at the two 
branches is independent and there is no correlation between the signal envelopes. Now 
assume that the two are correlated with the power correlation coefficient $k^2$ defined as 

$$k^2 = \frac{E\left[\left(a_n^2 - E[a_n^2]\right)\left(b_n^2 - E[b_n^2]\right)\right]}{\sqrt{E\left[\left(a_n^2 - E[a_n^2]\right)^2\right]\,E\left[\left(b_n^2 - E[b_n^2]\right)^2\right]}} \tag{8.2.18}$$

The average BER for this case is a function of $k^2$ and is given by [Abu94a] 


psw 

1 e 


=c r(m) 


D n 


m-l 

l-e<- D )r^y 


feD)' + e 


-5c 


2 | —+ 1 

,2m 


ej f 1 m 

m-l L, n -1- 

r 


j=0 


V- 


(8.2.19) 
where constants C and D are defined as 


and 



D = 


m 


(l-k 2 )r 


1 - 


mk" 




m + 


( 8 . 2 . 20 ) 


( 8 . 2 . 21 ) 


The optimum threshold $\hat{\xi}_0$, which minimizes $P_e^{SW}$ for this case, becomes 




l + 2in -2D 

r 


In 


2Cr(m) 


( 8 . 2 . 22 ) 


For the Rayleigh fading case, substituting m = 1 in (8.2.19) and (8.2.22), expressions for 
$P_e^{SW}$ and $\hat{\xi}_0$ become 


and 



I 


o 



where constants G and F are defined as 


G = 


1 

2 r+r 2 (i-k 2 ) 


(8.2.23) 


(8.2.24) 


(8.2.25) 


and 


F = 


r(l-k 2 ) 


1 - 


i + i(!- k2 )r 


(8.2.26) 


These equations can be used to evaluate the average BER and optimal threshold values 
for various fading parameters. Using these equations to plot the optimal threshold as a 
function of SNR, it is reported in [Abu94a] that $\hat{\xi}_0$ is an increasing function of SNR and 
the fading parameter m, and a decreasing function of the correlation coefficient. 


8.3 Equal Gain Combiner 

In equal gain combining, the desired signals on all branches are co-phased and equally 
weighted before summing to produce the output. Without loss of generality, assume that 
each channel has a unity gain. Thus, the weights of an equal gain combiner (EGC) are 
given by 

$$w_i = e^{j\theta_i}, \quad i = 1, 2, \ldots, L \tag{8.3.1}$$

and the signal envelope at the output of the EGC is the sum of L signal envelopes, that is, 

$$r = \sum_{i=1}^{L} r_i \tag{8.3.2}$$

In this section, the performance of an EGC in both noise-limited and interference-limited 
environments in the presence of Nakagami fading is analyzed. 


8.3.1 Noise-Limited Systems 

First, consider a noise-limited system [Abu92, Abu94, Bea91, Zha97, Zha99]. In this section, 
expressions for the mean SNR, outage probability, and average BER for the EGC are 
derived. 


8.3.1.1 Mean SNR 

Let $S_{EG}$ denote the instantaneous signal power at the output of the EGC. It follows from 
(8.3.2) that it is given by 

$$S_{EG} = \frac{r^2}{2} \tag{8.3.3}$$

When each branch has the same noise power N, the total noise power at the output of 
the EGC is equal to NL, as each channel has a unity gain, and the output SNR $\gamma$ is given by 

$$\gamma = \frac{r^2}{2LN} \tag{8.3.4}$$

Let $\Gamma_{EG}$ denote the mean SNR at the output of the EGC. Thus, (8.3.4) implies that 

$$\Gamma_{EG} = E[\gamma] = \frac{E[r^2]}{2LN} \tag{8.3.5}$$

It follows from (8.3.2), assuming independent fading, that 

$$E[r^2] = E\left[\sum_{i=1}^{L}\sum_{j=1}^{L} r_i r_j\right] = \sum_{i=1}^{L}E[r_i^2] + \sum_{i=1}^{L}\sum_{\substack{j=1 \\ j\ne i}}^{L}E[r_i]\,E[r_j] \tag{8.3.6}$$

where the last step follows from the independent fading assumption. 
Denoting the mean signal power at each branch by S, it follows that 

$$E[r_i^2] = 2S, \quad i = 1, 2, \ldots, L \tag{8.3.7}$$

The second term on the RHS of (8.3.6) can be evaluated by noting that 

$$E[r_i] = \int_0^\infty r_i\,f_{r_i}(r_i)\,dr_i, \quad i = 1, 2, \ldots, L \tag{8.3.8}$$

where $f_{r_i}$ denotes the pdf of the signal envelope at the ith branch. 

For Nakagami distributed signals, the pdf of the signal envelope is given by (7.1.43). Substituting in 
(8.3.8) and evaluating the integral [Pro95], 

$$E[r_i] = \frac{\Gamma\left(m+\frac{1}{2}\right)}{\Gamma(m)}\left(\frac{2S}{m}\right)^{1/2}, \quad i = 1, 2, \ldots, L \tag{8.3.9}$$


Substituting from (8.3.7) and (8.3.9) in (8.3.6) and using (8.3.5), one obtains the following 
expression for the mean SNR of the EGC for independent Nakagami fading: 

$$\begin{aligned} \Gamma_{EG} &= \frac{1}{2LN}\left[2LS + (L^2 - L)\left(\frac{\Gamma\left(m+\frac{1}{2}\right)}{\Gamma(m)}\right)^2\frac{2S}{m}\right] \\ &= \Gamma\left[1 + \frac{L-1}{m}\left(\frac{\Gamma\left(m+\frac{1}{2}\right)}{\Gamma(m)}\right)^2\right] \end{aligned} \tag{8.3.10}$$

where $\Gamma = S/N$ denotes the mean SNR of a single branch. 

For the Rayleigh fading case, m = 1. Substituting m = 1 and noting that $\Gamma(3/2) = \sqrt{\pi}/2$, 
(8.3.10) becomes 

$$\Gamma_{EG} = \Gamma\left[1 + (L-1)\frac{\pi}{4}\right] \tag{8.3.11}$$

yielding an expression for the mean SNR of the EGC for independent Rayleigh fading. 
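
The mean SNR gain of the EGC over a single branch is easy to tabulate. The sketch below uses the log-gamma function to form the ratio $\Gamma(m+\frac{1}{2})/\Gamma(m)$ without overflow; the function name is an assumption for this illustration. 

```python
import math


def egc_mean_snr_gain(m, num_branches):
    """Ratio of EGC mean output SNR to single-branch mean SNR in Nakagami fading."""
    if m <= 0 or num_branches < 1:
        raise ValueError("need m > 0 and at least one branch")
    ratio = math.exp(math.lgamma(m + 0.5) - math.lgamma(m))  # Gamma(m + 1/2) / Gamma(m)
    return 1.0 + (num_branches - 1) * ratio ** 2 / m
```

For m = 1 the gain is $1 + (L-1)\pi/4$. As m grows the factor $\Gamma(m+\frac{1}{2})^2/(m\,\Gamma(m)^2)$ approaches 1 from below, so the gain approaches L without exceeding it. 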

8.3.1.2 Outage Probability 

Estimation of outage probability requires knowledge of $F_\gamma$, the cdf of $\gamma$. It can be obtained 
from $F_r$, the cdf of r, as follows. Let $F_X$ and $F_Y$ denote the cdfs of two RVs X and Y. If X 
and Y are related via 

$$Y = X^2, \quad X \ge 0 \tag{8.3.12}$$

then 

$$\begin{aligned} F_Y(\xi) &= P[Y \le \xi] \\ &= P[X^2 \le \xi] \\ &= P[X \le \sqrt{\xi}] \\ &= \int_0^{\sqrt{\xi}} f_X(x)\,dx \\ &= F_X(\sqrt{\xi}) \end{aligned} \tag{8.3.13}$$

Thus, it follows from (8.3.4) that $F_\gamma(\gamma)$ is given by 

$$F_\gamma(\gamma) = F_r\left(\sqrt{2LN\gamma}\right) \tag{8.3.14}$$

where $F_r(x)$ denotes the cdf of r, the sum of L independent RVs $r_i$, i = 1, 2, ..., L. 

When $r_i$, i = 1, 2, ..., L are i.i.d. RVs, and are Nakagami distributed with parameter m, 
$F_r(x)$ can be computed within a determined accuracy using the following infinite series 
[Bea90, Bea91]: 

$$F_r(x) \approx \frac{1}{2} - \frac{2}{\pi}\sum_{\substack{n=1 \\ n\ \mathrm{odd}}}^{\infty}\frac{A_n\sin\theta_n}{n} \tag{8.3.15}$$

where 

$$A_n = \left(\Phi_R^2 + \Phi_I^2\right)^{L/2} \tag{8.3.16}$$

$$\theta_n = L\tan^{-1}\left[\frac{\Phi_I\cos(n\omega\varepsilon) - \Phi_R\sin(n\omega\varepsilon)}{\Phi_R\cos(n\omega\varepsilon) + \Phi_I\sin(n\omega\varepsilon)}\right] \tag{8.3.17}$$

$$\Phi_R = E[\cos(n\omega x)] \tag{8.3.18}$$

$$\Phi_I = E[\sin(n\omega x)] \tag{8.3.19}$$

$$\varepsilon = \frac{x}{L} \tag{8.3.20}$$

and 

$$\omega = \frac{2\pi}{T} \tag{8.3.21}$$

with x denoting the signal envelope at one of the branches and T denoting the period of 
the square wave used in deriving the series. T determines the accuracy of the results. A 
value of T between 40 and 80 has been suggested in [Bea91]. 

For RVs $r_i$, i = 1, 2, ..., L, Nakagami distributed with parameter m, the expectations in 
(8.3.18) and (8.3.19) become 

$$\Phi_R = {}_1F_1\left(m; \frac{1}{2}; -\frac{n^2\omega^2 S}{2m}\right) \tag{8.3.22}$$

and 

$$\Phi_I = \sqrt{\frac{2S}{m}}\,\frac{\Gamma(m+0.5)}{\Gamma(m)}\,n\omega\;{}_1F_1\left(m+\frac{1}{2}; \frac{3}{2}; -\frac{n^2\omega^2 S}{2m}\right) \tag{8.3.23}$$

where ${}_1F_1(\cdot\,;\cdot\,;\cdot)$ denotes the confluent hypergeometric function [Abr72], which is defined as 

$${}_1F_1(a; b; x) = \sum_{n=0}^{\infty}\frac{\Gamma(a+n)\,\Gamma(b)\,x^n}{\Gamma(a)\,\Gamma(b+n)\,n!} \tag{8.3.24}$$

This function can be calculated as follows [Bea91]: 


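$${}_1F_1\left(m+\frac{1}{2}; \frac{3}{2}; -a\right) = e^{-a}\sum_{k=0}^{m-1}(-1)^k\binom{m-1}{k}\frac{2^k a^k}{(2k+1)!!} \tag{8.3.25}$$

where 

$$(2k+1)!! = (2k+1)(2k-1)\cdots(3)(1)$$

${}_1F_1\left(m; \frac{1}{2}; -a\right)$ can be calculated recursively as follows: 

$${}_1F_1\left(m; \tfrac{1}{2}; -a\right) = \frac{1}{m-1}\left[\left(-a + \frac{4m-5}{2}\right){}_1F_1\left(m-1; \tfrac{1}{2}; -a\right) + \frac{3-2m}{2}\,{}_1F_1\left(m-2; \tfrac{1}{2}; -a\right)\right], \quad m \ge 2 \tag{8.3.26}$$

$${}_1F_1\left(1; \tfrac{1}{2}; -a\right) = -e^{-a}\sum_{i=0}^{\infty}\frac{a^i}{(2i-1)\,i!} \tag{8.3.27}$$

when the parameter a is a small quantity, 

$${}_1F_1\left(1; \tfrac{1}{2}; -a\right) \approx -\frac{1}{2a}\left[1 + \frac{1\cdot 3}{2a} + \frac{1\cdot 3\cdot 5}{(2a)^2} + \frac{1\cdot 3\cdot 5\cdot 7}{(2a)^3} + \cdots\right] \tag{8.3.28}$$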


when the parameter a is a medium to large quantity, and 



(8.3.29) 
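
The finite-sum form of ${}_1F_1(m+\frac{1}{2}; \frac{3}{2}; -a)$ can be checked against the defining series (8.3.24). The sketch below implements both in plain Python. The function names and the fixed number of series terms are assumptions, and the direct series is only numerically reliable for moderate values of the argument. 

```python
import math


def hyp1f1_series(a, b, x, terms=200):
    """Confluent hypergeometric function from its defining power series."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive series terms: (a + n) x / ((b + n)(n + 1)).
        term *= (a + n) * x / ((b + n) * (n + 1))
    return total


def odd_double_factorial(k):
    """(2k + 1)!! = (2k + 1)(2k - 1)...(3)(1)."""
    result = 1
    for j in range(1, 2 * k + 2, 2):
        result *= j
    return result


def hyp1f1_finite(m, a):
    """1F1(m + 1/2; 3/2; -a) for integer m >= 1 via the finite sum."""
    return math.exp(-a) * sum(
        (-1) ** k * math.comb(m - 1, k) * (2.0 * a) ** k / odd_double_factorial(k)
        for k in range(m)
    )
```

The finite sum has only m terms and avoids the cancellation that the alternating power series suffers for large a, which is why a closed form of this kind is preferable when evaluating the characteristic-function terms repeatedly. 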


For a discussion of the diversity gain obtainable using L = 2, 4, and 8, and m = 1, 2, 3, 
and 4, see [Bea91]. The discussion concludes that a transmitter requires 11 dB less power 
using a dual-diversity system in a Rayleigh fading condition (m = 1). The required power 
decreases as diversity branches (L) increase, whereas an increase in power is required in 
more severe fading as m increases. 


8.3.1.3 Average BER 

Calculation of the average BER using (7.3.5) requires knowledge of the SNR pdf for the 
EGC, which is not available. However, the average BER can be obtained using 

$$P_e = \int_0^\infty P_e(r)\,f_r(r)\,dr \tag{8.3.30}$$

where r denotes the amplitude of the signal envelope, $f_r$ denotes the pdf of r, and $P_e(r)$ 
denotes the conditional BER for a given value of r. 

When r is the sum of i.i.d. RVs, as is assumed to be the case for EGC, $f_r$ is given by [Abu92] 

$$f_r(x) = \frac{2}{T}\sum_{\substack{n=1 \\ n\ \mathrm{odd}}}^{\infty}\left[A_n e^{j\tau_n}e^{-jn\omega x} + A_n e^{-j\tau_n}e^{jn\omega x}\right] \tag{8.3.31}$$

where 

$$\tau_n = L\tan^{-1}\left(\frac{\Phi_I}{\Phi_R}\right) \tag{8.3.32}$$

$A_n$, $\Phi_R$, $\Phi_I$, and $\omega$ were defined previously. 

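Substituting for $\gamma$ from (8.3.4) in (7.3.55) to (7.3.58), expressions for the conditional BER 
as a function of the signal component become 

$$P_e(r) = \begin{cases} Q\left(\sqrt{\dfrac{r^2}{LN}}\right) & \text{BPSK} \\[2ex] Q\left(\sqrt{\dfrac{r^2}{2LN}}\right) & \text{CFSK} \end{cases} \tag{8.3.33}$$

and 

$$P_e(r) = \frac{1}{2}\exp\left(-\frac{g\,r^2}{2LN}\right), \quad \begin{cases} g = 1 & \text{DPSK} \\ g = 0.5 & \text{NCFSK} \end{cases} \tag{8.3.34}$$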


Substituting for P e (r) from (8.3.33) and f r (r) from (8.3.31) in (8.3.30) and carrying out the 
integral, the average BER becomes [Bea91] 


where 


P = — A B cos(x -a ) 

e rp n n \ n n / 

n=0 
n odd 


B =■ 


1 

1 - exp 

f-n 2 co 2 \ 

-|2 

1 

_i_ 

f 

(nco) 2 

l 4b 2 J 

nb 2 

1M 

V 


l; 


3 -n _ ( 0 ‘ 


4b 


(8.3.35) 


(8.3.36) 


and 


a = tan 


V7C b 


1-exp 


nco 1 F 1 


1; 


( -n 2 co 2 ^ 
V 4b 2 L 

3,-nV' 
2' 4b 2 


b = 


i g 
V 2NL 


(8.3.37) 


(8.3.38) 


where g = 1 for coherent BPSK and g = 0.5 for CFSK. 

Similarly, substituting for $P_e(r)$ from (8.3.34) and $f_r(r)$ from (8.3.31) in (8.3.30), and 
carrying out the integral, the average BER for DPSK and NCFSK systems becomes [Bea91] 


P =-V A D cos(t -B ) 

e rp n n \ n *n/ 

1 n=l 
n odd 

where 


D = 


[ 71 

f-n 2 co 2 ^ 

2 2 f 

|+ nro F 2 

[b 2 exp l 

l 2b 2 J 

1 b 4 


l; 


3 -n co 


2.. 2 X 


4b 2 


p n = tan" 


nco jF 


f 3 -n 2 co 2 ^ 
v'2' 4b 2 , 



f -n 2 co 2 'j 

VtT b exp 

l 4b 2 J 


(8.3.39) 


(8.3.40) 


(8.3.41) 


b is given by (8.3.38), g = 1 for DPSK, and g = 0.5 for NCFSK. 
The results presented here are for the case in which there is no gain unbalance on different 
branches of EGC. For the effect of gain unbalance on the performance of EGC, see [Bea91]. 

The error rate performance of an EGC in Rician fading channels is discussed in [Abu94], 
and is compared with that of the MRC and SC for BPSK and NCFSK signals. 

8.3.1.4 Use of Characteristic Function 

Calculation of the average BER by the above method involved two steps. First, determine 
the pdf of the required variable and then use it to obtain the average BER. The average 
BER can also be determined using a one-step procedure using the characteristic function 
(CF) of the decision variable, as in [Zha97, Zha99] 

v * = \~h] t Im K (t) ] dt (8 - 3 - 42) 

where Im[x] denotes the imaginary part of x, and $\psi_r(t)$ denotes the characteristic function 
of r (the decision variable at the output of EGC), defined as 

$$\psi_r(t) = E\left[e^{jrt}\right] \tag{8.3.43}$$

It provides a general formula for evaluating the average BER for an EGC with coherent 
detection, and applies to arbitrary fading channels as long as their CF exists. The solution 
relies on numerical methods to estimate the integral. An algorithm to evaluate the integral 
using the Hermite method is discussed in [Zha99] for coherent detection of BPSK signals 
in Rayleigh fading channels. More discussion on the use of CFs may be found in Section 8.6. 


8.3.2 Interference-Limited Systems 

Consider a desired signal and K co-channel interferences in a Nakagami fading environment 
with fading parameter m. The interference power $I_{EG}$ at the output of EGC is given by 

$$I_{EG} = \frac{Y}{2} \tag{8.3.44}$$

with 

$$Y = \sum_{i=1}^{L}\sum_{j=1}^{K} q_{ij}^2 \tag{8.3.45}$$

Then the signal power to interference power ratio $\mu_{EG}$ at the output of EGC is given by 

$$\mu_{EG} = \frac{S_{EG}}{I_{EG}} = \frac{r^2}{Y} \tag{8.3.46}$$


8.3.2.1 Outage Probability 

Let $P^o_{EG}$ denote the outage probability for EGC. It is given by 

$$P^o_{EG} = P[\mu_{EG} \le \mu_0] \tag{8.3.47}$$

Substituting from (8.3.46) and using the argument in deriving (7.3.27), it follows that 

$$P^o_{EG} = P\left[\frac{r^2}{Y} \le \mu_0\right] = \int_0^\infty f_r(x)\int_{x^2/\mu_0}^{\infty} f_Y(y)\,dy\,dx \tag{8.3.48}$$

where $f_Y(y)$ denotes the pdf of Y given by 

$$f_Y(y) = \left(\frac{m}{\Omega_y}\right)^{mKL}\frac{y^{mKL-1}}{\Gamma(mKL)}\exp\left(-\frac{my}{\Omega_y}\right) \tag{8.3.49}$$

with 

$$\Omega_y = 2I \tag{8.3.50}$$

and $f_r$ is given by (8.3.31). Substituting for $f_r$ and $f_Y(y)$ in (8.3.48), it becomes [Abu92] 


where 


P° = 


n=l 
n odd 


MKL-1 ( 


m 


B_ 


-COS' 


K-«ni) 


B . = Ja z 

m V r 


+ b? 


a„, = tan 


-i b ni 

a,„ 


and 



A 


i + 


1 1 
2 ; 2 ; 


n2( ° 2 ^yho 

4m 




F(i + l)nco 

f i + 1 .3.. 

na,2 SPo" 

2(m/il y p 0 ) )‘ +1 1 1 

1 + A / 2 ' 

V 

4m 

J 


The function $_1F_1(\cdot)$ in (8.3.54) and (8.3.55) can be computed as follows:


(8.3.51) 


(8.3.52) 

(8.3.53) 


(8.3.54) 


(8.3.55) 


lFj i+f ;-a i = 


e a -2ae' 


l J -1 

■EE 


(- 2 )* 


*^(2t + l)!! 

1 t=o v ’ 


'j-* 

V t , 


|a‘, i > 1 
i = 0 


(8.3.56)
and


iFi f, +1; 3 ;- a ^| = e-Jl- + al + (t-2iK3-axs-^ 1 ) al + L } (8.3.57) 

1 \ 2 J l 3 3-5 2! 3-5-7 3! j 

8.3.2.2 Mean Signal Power to Mean Interference Power Ratio 

In this section, an expression for the mean signal power to the mean interference power
ratio $\bar{\mu}_{EG}$ at the output of the EGC is derived in Rayleigh fading channels. Let $\bar{S}_{EG}$ and $\bar{I}_{EG}$
denote the mean signal power and mean interference power at the output of the EGC. Thus,

$$\bar{\mu}_{EG} = \frac{\bar{S}_{EG}}{\bar{I}_{EG}} \tag{8.3.58}$$

Noting that $\Gamma = S/N$ and $\Gamma_{EG} = \bar{S}_{EG}/(LN)$, it follows from (8.3.11) that for Rayleigh fading
channels (m = 1), the mean output signal power is given by

$$\bar{S}_{EG} = SL\left[1 + (L-1)\frac{\pi}{4}\right] \tag{8.3.59}$$

Using the fact that the mean interference power equals I on every branch and for every
interference, i = 1, ..., L, j = 1, 2, ..., K (8.3.60), (8.3.44) and (8.3.45) imply

$$\bar{I}_{EG} = ILK \tag{8.3.61}$$

Thus, it follows from (8.3.58), (8.3.59), and (8.3.61) that the ratio of the mean signal power
to the mean interference power at the output of EGC becomes

$$\bar{\mu}_{EG} = \frac{S}{I}\,\frac{1 + (L-1)\pi/4}{K} \tag{8.3.62}$$

The ratio increases with the number of branches in the combiner.


8.4 Maximum Ratio Combiner 

In maximal ratio combining, the signals on all branches are co-phased and the gain on 
each branch is set equal to the signal amplitude to the mean noise power ratio [Jak74]. 
Thus, the branch weights are given by 

$$w_i = a_i e^{j\theta_i}, \quad i = 1, 2, \ldots, L \tag{8.4.1}$$

with

$$a_i = \frac{r_i}{N_i} \tag{8.4.2}$$

where $N_i$ denotes the mean noise power on the ith branch.

When the mean noise power on all branches is identical, that is, $N_i = N$, i = 1, 2, ..., L,
the gain on each branch becomes proportional to the signal amplitude. The difference
between a maximal ratio combiner (MRC) and an EGC is that in EGC, $a_i = 1$, i = 1, 2, ..., L.

In this section, the performance of an MRC is evaluated. Both noise-limited and
interference-limited systems are considered [Sha00a, Tom99, Zha99a].


8.4.1 Noise-Limited Systems 

In this section, expressions for the mean signal to noise ratio, outage probability, and 
average BER are derived. 

8.4.1.1 Mean SNR

First, consider the mean SNR at the output of the MRC. It follows from (8.5) and (8.4.1) 
that the signal envelope at the output of the MRC is given by 

$$r = \sum_{i=1}^{L} a_i r_i \tag{8.4.3}$$

Thus, the output signal envelope is the sum of individual signal envelopes weighted with
respective branch gains. Similarly, the total noise power $N_T$ at the output is given by

$$N_T = \sum_{i=1}^{L} N_i a_i^2 \tag{8.4.4}$$

where the mean noise power of each channel has been weighted by the branch power
gain, namely the square of the branch gain, before summing.

The instantaneous SNR $\gamma$ at the output of the combiner is given by

$$\gamma = \frac{r^2}{2N_T} = \sum_{i=1}^{L}\frac{r_i^2}{2N_i} = \sum_{i=1}^{L}\gamma_i \tag{8.4.5}$$

Thus, the SNR at the output of the combiner is the sum of the branch SNRs.

Let $\Gamma_{MR}$ denote the mean SNR of the MRC. It follows from (8.4.5) that $\Gamma_{MR}$ is given by

$$\Gamma_{MR} = E[\gamma] = \sum_{i=1}^{L} E[\gamma_i] = L\Gamma \tag{8.4.6}$$

where the last step follows from the assumption that the noise power on each branch is
N, and $\Gamma$ denotes the mean SNR at each branch. Thus, the mean SNR at the output of an
MRC varies linearly with the number of branches in the combiner.


8.4.1.2 Rayleigh Fading Environment 

For the Rayleigh fading environment, first consider the pdf of the SNR at the output of 
the MRC, and then the outage probability and average BER. 

8.4.1.2.1 PDF of Output SNR 

When $r_i$, i = 1, 2, ..., L are Rayleigh distributed, the pdf of $\gamma$ may be estimated as follows.
Consider an RV y given by

$$y = \sum_{i=1}^{n} x_i^2 \tag{8.4.7}$$

where $x_i$ denotes a Gaussian RV of zero mean and variance $\sigma^2$. The RV y has a chi-squared
distribution with n degrees of freedom with pdf $f_y$ given by (7.1.29). Rewriting,

$$f_y(y) = \frac{1}{(\sqrt{2}\sigma)^n\,\Gamma(n/2)}\; y^{\frac{n}{2}-1} e^{-\frac{y}{2\sigma^2}} \tag{8.4.8}$$

Now, using (7.1.33) and (7.1.47), (8.4.5) can be expressed as

$$\gamma = \sum_{i=1}^{L}\frac{1}{2N}\left(a_i^2 + b_i^2\right) \tag{8.4.9}$$

where $a_i$ and $b_i$ denote two Gaussian RVs with zero mean and variance equal to S.

As $a_i/\sqrt{2N}$ and $b_i/\sqrt{2N}$, i = 1, 2, ..., L are 2L Gaussian RVs with zero mean and variance
equal to $S/(2N) = \Gamma/2$, it follows from a comparison of (8.4.9) and (8.4.7) that $\gamma$ has a
chi-squared distribution with 2L degrees of freedom. Substituting n = 2L and $\sigma^2 = \Gamma/2$ in
(8.4.8) and noting that $\Gamma(L) = (L-1)!$,

$$f_\gamma(\gamma) = \frac{1}{\Gamma^L (L-1)!}\,\gamma^{L-1} e^{-\gamma/\Gamma} \tag{8.4.10}$$

Alternately, an expression for $f_\gamma(\gamma)$ may be derived using CFs as discussed in Section 8.7.

8.4.1.2.2 Outage Probability
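The outage probability is given by

$$P^o_{MR} = P[\gamma \le \gamma_0] = F_\gamma(\gamma_0) \tag{8.4.11}$$

with

$$F_\gamma(\gamma) = \int_0^{\gamma} f_\gamma(x)\,dx \tag{8.4.12}$$

Substituting for $f_\gamma(\gamma)$ from (8.4.10) in (8.4.12) and carrying out the integral, one obtains the
following expression for the distribution of $\gamma$ [Jak74]:

$$F_\gamma(\gamma) = \int_0^{\gamma}\frac{x^{L-1}e^{-x/\Gamma}}{\Gamma^L (L-1)!}\,dx = 1 - e^{-\gamma/\Gamma}\sum_{k=1}^{L}\frac{(\gamma/\Gamma)^{k-1}}{(k-1)!} \tag{8.4.13}$$

The outage probability at the output of MRC in Rayleigh distributed channels is then
given by $F_\gamma(\gamma_0)$.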
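
The closed form in (8.4.13) is straightforward to evaluate. The following minimal sketch computes the MRC outage probability for a few diversity orders; the function name and the example values (threshold equal to the mean branch SNR) are assumptions chosen only for illustration.

```python
import math


def mrc_outage_rayleigh(gamma0, mean_snr, branches):
    """Outage probability of an L-branch MRC in Rayleigh fading, per (8.4.13).

    gamma0   : SNR threshold (linear scale)
    mean_snr : mean SNR per branch, Gamma (linear scale)
    branches : number of branches L
    """
    x = gamma0 / mean_snr
    partial_sum = sum(x ** (k - 1) / math.factorial(k - 1)
                      for k in range(1, branches + 1))
    return 1.0 - math.exp(-x) * partial_sum


# Outage falls quickly as branches are added (threshold = mean branch SNR).
for L in (1, 2, 4):
    print(L, mrc_outage_rayleigh(1.0, 1.0, L))
```

With a single branch the expression reduces to 1 - exp(-gamma0/Gamma), the Rayleigh single-channel result.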

8.4.1.2.3 Average BER 

Let $P_e^{MR}$ denote the average BER at the output of the MRC. It can be obtained by averaging
the conditional BER for a fixed SNR $\gamma$ over the pdf of $\gamma$, that is,

$$P_e^{MR} = \int_0^{\infty} P_e(\gamma) f_\gamma(\gamma)\,d\gamma \tag{8.4.14}$$

where $P_e(\gamma)$ is the BER for a fixed $\gamma$ at the output of the MRC for an arbitrary modulation
scheme. For coherent BPSK, coherently detected orthogonal FSK, and DPSK, $P_e(\gamma)$ is given by
[Pro95]

$$P_e(\gamma) = Q\left(\sqrt{2\gamma}\right) \quad \text{BPSK} \tag{8.4.15}$$

$$P_e(\gamma) = Q\left(\sqrt{\gamma}\right) \quad \text{CFSK} \tag{8.4.16}$$

and

$$P_e(\gamma) = \left(\frac{1}{2}\right)^{2L-1} e^{-\gamma}\sum_{k=0}^{L-1} b_k \gamma^k \quad \text{DPSK} \tag{8.4.17}$$

where

$$b_k = \frac{1}{k!}\sum_{n=0}^{L-1-k}\binom{2L-1}{n} \tag{8.4.18}$$

The pdf of $\gamma$ at the output of the MRC in the Rayleigh fading environment is given by
(8.4.10). Substituting in (8.4.14) and evaluating the integral, the expression for the average
BER is obtained [Pro95]. For BPSK, it becomes

$$P_e^{MR} = \left[\frac{1}{2}(1-\Gamma_0)\right]^{L}\sum_{k=0}^{L-1}\binom{L-1+k}{k}\left[\frac{1}{2}(1+\Gamma_0)\right]^{k} \tag{8.4.19}$$

where

$$\Gamma_0 = \sqrt{\frac{\Gamma}{1+\Gamma}} \tag{8.4.20}$$

For CFSK, it is also given by (8.4.19), with $\Gamma_0$ defined as

$$\Gamma_0 = \sqrt{\frac{\Gamma}{2+\Gamma}} \tag{8.4.21}$$

For DPSK, it becomes

$$P_e^{MR} = \frac{1}{2^{2L-1}(L-1)!\,(1+\Gamma)^{L}}\sum_{k=0}^{L-1} b_k (L-1+k)!\left(\frac{\Gamma}{1+\Gamma}\right)^{k} \tag{8.4.22}$$

where $b_k$ is given by (8.4.18). The average BER for NCFSK can be obtained by replacing
$\Gamma$ with $\Gamma/2$ in (8.4.22).

It should be noted that these results are for a slow fading environment, such that the
$r_i e^{j\theta_i}$, i = 1, 2, ..., L are constant over the bit duration. For DPSK modulation, these are
assumed to be constant over the duration of two bits.

Evaluating $P_e^{MR}$ as a function of L and $\Gamma$ using the above expressions, one can determine
the effect of diversity on the average BER.
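
To see the effect of diversity numerically, the sketch below evaluates the closed-form average BER of (8.4.19) with the parameter of (8.4.20) or (8.4.21), and cross-checks the BPSK case against a direct midpoint-rule evaluation of the averaging integral (8.4.14) using the pdf (8.4.10). The function names, the integration limit, and the step count are assumptions made for this illustration.

```python
import math


def mrc_ber_rayleigh(mean_snr, branches, modulation="bpsk"):
    """Closed-form average BER of an L-branch MRC in Rayleigh fading, (8.4.19).

    modulation: "bpsk" uses (8.4.20); any other value is treated as CFSK (8.4.21).
    """
    offset = 1.0 if modulation == "bpsk" else 2.0
    g0 = math.sqrt(mean_snr / (offset + mean_snr))
    lead = ((1.0 - g0) / 2.0) ** branches
    tail = sum(math.comb(branches - 1 + k, k) * ((1.0 + g0) / 2.0) ** k
               for k in range(branches))
    return lead * tail


def mrc_ber_bpsk_numeric(mean_snr, branches, upper=40.0, steps=40000):
    """Midpoint-rule evaluation of (8.4.14) for BPSK with the pdf (8.4.10).

    Q(sqrt(2*g)) equals 0.5*erfc(sqrt(g)); it is negligible beyond g = 40.
    """
    h = upper / steps
    norm = mean_snr ** branches * math.factorial(branches - 1)
    total = 0.0
    for n in range(steps):
        g = (n + 0.5) * h
        pdf = g ** (branches - 1) * math.exp(-g / mean_snr) / norm
        total += 0.5 * math.erfc(math.sqrt(g)) * pdf
    return total * h


mean_snr = 10.0  # 10 dB per branch, linear scale
for L in (1, 2, 4):
    print(L, mrc_ber_rayleigh(mean_snr, L), mrc_ber_bpsk_numeric(mean_snr, L))
```

The two columns should agree closely, and the BER should drop steeply as L grows, which is the diversity effect described in the text.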

8.4.1.3 Nakagami Fading Environment 

The average BERs for coherent BPSK and CFSK in the Nakagami fading environment with
integer m are given by [Zha99a]

$$P_e^{MR} = \left[\frac{1}{2}(1-\Gamma_0)\right]^{mL}\sum_{k=0}^{mL-1}\binom{mL-1+k}{k}\left[\frac{1}{2}(1+\Gamma_0)\right]^{k} \tag{8.4.23}$$

where $\Gamma_0$ is defined for BPSK as

$$\Gamma_0 = \sqrt{\frac{\Gamma}{m+\Gamma}} \tag{8.4.24}$$

and for CFSK as

$$\Gamma_0 = \sqrt{\frac{\Gamma}{2m+\Gamma}} \tag{8.4.25}$$

Results for noninteger m and correlated fading may be found in [Zha99a].


8.4.1.4 Effect of Weight Errors 

The effect of weight errors introduced by incorrect estimates of the channel gain is
examined in [Tom99]. Let $\rho$ denote the normalized correlation between the actual complex
channel gain $c_i = \alpha_i e^{j\theta_i}$, i = 1, 2, ..., L and its estimate $\hat{c}_i$ at some time t, with squared
correlation given by

$$\rho^2 = \frac{\left|E[c_i \hat{c}_i^*]\right|^2}{E[c_i c_i^*]\,E[\hat{c}_i \hat{c}_i^*]} \tag{8.4.26}$$

Note that $\rho^2 = 1$ corresponds to the estimate with no error.

8.4.1.4.1 Output SNR pdf

When the estimated channel gain $\hat{c}_i$ differs from the actual channel gain $c_i$, the weights used
in the MRC differ from those given by (8.4.1) by an error component. Assuming that these
weight errors are complex Gaussian distributed RVs, it can be shown that the pdf of the
output SNR in the presence of weight errors is given by [Tom99, Gan71]

$$\hat{f}_\gamma(\gamma) = \sum_{k=1}^{L} A(k)\,\frac{\gamma^{k-1} e^{-\gamma/\Gamma}}{(k-1)!\,\Gamma^{k}} \tag{8.4.27}$$

where

$$A(k) = \binom{L-1}{k-1}\left(1-\rho^2\right)^{L-k}\rho^{2(k-1)} \tag{8.4.28}$$

This is an interesting result and shows that the pdf of $\gamma$ in the presence of errors is the
weighted sum of the pdfs of $\gamma$ in the absence of errors, with the weighting coefficient A(k)
given by (8.4.28), that is,

$$\hat{f}_\gamma(\gamma) = \sum_{k=1}^{L} A(k)\, f_\gamma(\gamma)\big|_{L=k} \tag{8.4.29}$$

where $f_\gamma(\gamma)$ is given by (8.4.10), evaluated for a k-branch combiner. The pdf of $\gamma$ in the
presence of errors may be used to estimate the effect of errors on the outage probability
and average BER.

8.4.1.4.2 Outage Probability

The outage probability $P^o_{MR}$ in the absence of errors is given by (8.4.11) with $F_\gamma(\gamma)$ denoting
the distribution function of $\gamma$ in the absence of errors. Thus, it follows that the outage
probability in the presence of errors $\hat{P}^o_{MR}$ is given by

$$\hat{P}^o_{MR} = \hat{F}_\gamma(\gamma_0) \tag{8.4.30}$$

with the distribution function in the presence of errors $\hat{F}_\gamma(\gamma)$ given by

$$\hat{F}_\gamma(\gamma) = \int_0^{\gamma}\hat{f}_\gamma(x)\,dx \tag{8.4.31}$$

Substituting for $\hat{f}_\gamma(\gamma)$ from (8.4.27), it becomes [Tom99, Gan71]

$$\hat{F}_\gamma(\gamma) = 1 - \sum_{k=1}^{L} A(k)\, e^{-\gamma/\Gamma}\sum_{n=1}^{k}\frac{(\gamma/\Gamma)^{n-1}}{(n-1)!} \tag{8.4.32}$$

Note that in the absence of errors, $\rho^2 = 1$, (8.4.32) reduces to (8.4.13) as only A(L) is
nonzero. For $\rho^2 = 0$, when the channel estimate is completely uncorrelated with the actual
channel parameters, only A(1) is nonzero, and the distribution function reduces to that of
the single branch case. Hence, no diversity advantage is available [Tom99, Gan71].
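
The weighting in (8.4.28) is a binomial distribution over the effective diversity order, which the short sketch below makes explicit. It computes A(k) and the resulting outage probability as the A(k)-weighted sum of error-free k-branch outage probabilities, following (8.4.32) and (8.4.13). The function names and example values are assumptions for illustration only.

```python
import math


def weight_error_coeffs(branches, rho_sq):
    """Coefficients A(k), k = 1..L, of (8.4.28) for squared correlation rho_sq."""
    L = branches
    return [math.comb(L - 1, k - 1) * (1.0 - rho_sq) ** (L - k) * rho_sq ** (k - 1)
            for k in range(1, L + 1)]


def mrc_outage(gamma0, mean_snr, branches):
    """Error-free MRC outage probability in Rayleigh fading, per (8.4.13)."""
    x = gamma0 / mean_snr
    return 1.0 - math.exp(-x) * sum(x ** (n - 1) / math.factorial(n - 1)
                                    for n in range(1, branches + 1))


def mrc_outage_with_weight_errors(gamma0, mean_snr, branches, rho_sq):
    """Outage probability with weight errors, per (8.4.32)."""
    coeffs = weight_error_coeffs(branches, rho_sq)
    return sum(a_k * mrc_outage(gamma0, mean_snr, k)
               for k, a_k in enumerate(coeffs, start=1))


L, gamma0, mean_snr = 4, 1.0, 10.0
for rho_sq in (1.0, 0.9, 0.5, 0.0):
    print(rho_sq, mrc_outage_with_weight_errors(gamma0, mean_snr, L, rho_sq))
```

The two limiting cases discussed in the text appear directly: rho_sq = 1 reproduces the L-branch result and rho_sq = 0 reproduces the single-branch result.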


8.4.1.4.3 Average BER 

Similarly, the effect of errors on the average BER may be obtained by replacing the pdf of
$\gamma$ in the absence of errors, $f_\gamma(\gamma)$, with $\hat{f}_\gamma(\gamma)$ in (8.4.14). For this case, the average BER becomes

$$\hat{P}_e^{MR} = \int_0^{\infty} P_e(\gamma)\,\hat{f}_\gamma(\gamma)\,d\gamma \tag{8.4.33}$$

Substituting for $\hat{f}_\gamma(\gamma)$ from (8.4.29) and using (8.4.14), it follows that

$$\hat{P}_e^{MR} = \int_0^{\infty} P_e(\gamma)\sum_{k=1}^{L} A(k)\, f_\gamma(\gamma)\big|_{L=k}\,d\gamma = \sum_{k=1}^{L} A(k)\int_0^{\infty} P_e(\gamma)\, f_\gamma(\gamma)\big|_{L=k}\,d\gamma = \sum_{k=1}^{L} A(k)\, P_e^{MR}\big|_{L=k} \tag{8.4.34}$$

Thus, the average BER in the presence of errors is the weighted sum of the average BERs
in the absence of errors for k = 1, ..., L branches.

8.4.2 Interference-Limited Systems

Expressions for the mean SIR, outage probability, and average BER are examined in this 
section. 

8.4.2.1 Mean Signal Power to Interference Power Ratio 

Assume that $a_i = \alpha_i$ in (8.4.2), with $\alpha_i$ denoting the amplitude of the signal channel gain
on the ith channel. When the mean noise power is identical on all channels, the weight
vector for MRC can be expressed as

$$\mathbf{w} = \mathbf{c}_s \tag{8.4.35}$$

For an interference-limited system, the array signal x(t) due to a desired signal and K
identical interferences is given by

$$\mathbf{x}(t) = \sqrt{p_s}\,\mathbf{c}_s\, g(t) + \sqrt{p_I}\sum_{k=1}^{K}\mathbf{c}_{Ik}\, g_k(t) \tag{8.4.36}$$

and the output of the MRC becomes

$$y(t) = \mathbf{w}^H\mathbf{x}(t) = \sqrt{p_s}\,\mathbf{c}_s^H\mathbf{c}_s\, g(t) + \sqrt{p_I}\sum_{k=1}^{K}\mathbf{c}_s^H\mathbf{c}_{Ik}\, g_k(t) \tag{8.4.37}$$

It then follows that the signal power $S_{MR}$ and the interference power $I_{MR}$ for given $\mathbf{c}_s$
and $\mathbf{c}_{Ik}$ are, respectively, given by

$$S_{MR} = p_s\left(\mathbf{c}_s^H\mathbf{c}_s\right)^2 \tag{8.4.38}$$

and

$$I_{MR} = p_I\sum_{k=1}^{K}\left|\mathbf{c}_s^H\mathbf{c}_{Ik}\right|^2 \tag{8.4.39}$$

and the SIR at the output of MRC, $\mu$, becomes

$$\mu = \tilde{\mu}\,\frac{\left(\mathbf{c}_s^H\mathbf{c}_s\right)^2}{\sum_{k=1}^{K}\left|\mathbf{c}_s^H\mathbf{c}_{Ik}\right|^2} \tag{8.4.40}$$

with

$$\tilde{\mu} = \frac{p_s}{p_I} \tag{8.4.41}$$

[Sha00a] shows that when the desired signal is Rice distributed and the interferences
are Rayleigh distributed, the pdf of $\mu$ is given by


f >) = < 


F 

1*1 


K + L; L; LK„ 


b + b 


k r(K+L) gL - 1 
r (K)r(L) (p+p) L+K 


(8.4.42) 


where $K_0$ denotes the Rice distribution parameter defined by (7.1.42). For $K_0 = 0$, the Rice
distribution becomes the Rayleigh distribution, and the pdf of $\mu$ for the Rayleigh
distributed signal case becomes

$$f_\mu(\mu) = \frac{\Gamma(L+K)}{\Gamma(L)\Gamma(K)}\,\frac{\tilde{\mu}^{K}\mu^{L-1}}{(\mu+\tilde{\mu})^{L+K}} \tag{8.4.43}$$

yielding

$$\bar{\mu}_{MR} = \int_0^{\infty}\mu f_\mu(\mu)\,d\mu = \frac{L\tilde{\mu}}{K-1}, \quad K > 1 \tag{8.4.44}$$

where $\bar{\mu}_{MR}$ denotes the mean value of the signal power to the interference power ratio at the
output of the MRC.

For $K_0 = \infty$, the desired signal becomes nonfading and the pdf for the nonfading signal
case becomes

$$f_\mu(\mu) = \frac{(L\tilde{\mu})^{K}}{\Gamma(K)}\,\mu^{-K-1} e^{-L\tilde{\mu}/\mu} \tag{8.4.45}$$

yielding

$$\bar{\mu}_{MR} = \int_0^{\infty}\mu f_\mu(\mu)\,d\mu = \frac{L\tilde{\mu}}{K-1}, \quad K > 1 \tag{8.4.46}$$

Thus, the mean signal power to interference power ratio is the same for both cases.


8.4.2.2 Outage Probability 

The outage probability is defined as

$$P^o = P[\mu < \mu_0] = \int_0^{\mu_0} f_\mu(\mu)\,d\mu \tag{8.4.47}$$

For MRC, when the signal envelope has a Rayleigh distribution, (8.4.47) and (8.4.43) yield
[Sha00a]

$$P^o_{MR} = \frac{\Gamma(L+K)}{\Gamma(L+1)\Gamma(K)}\left(\frac{\mu_0}{\tilde{\mu}}\right)^{L}{}_2F_1\!\left(L+K,\, L;\, L+1;\, -\frac{\mu_0}{\tilde{\mu}}\right) \tag{8.4.48}$$

where $_2F_1(a, b; c; x)$ denotes the Gauss hypergeometric function defined as [Abr72]

$${}_2F_1(a, b; c; x) = \sum_{n=0}^{\infty}\frac{(a)_n (b)_n}{(c)_n}\,\frac{x^n}{n!} \tag{8.4.49}$$

with

$$(x)_n = \frac{\Gamma(x+n)}{\Gamma(x)} \tag{8.4.50}$$

When the signal envelope is nonfading, (8.4.45) and (8.4.47) yield

$$P^o_{MR} = \frac{\Gamma(K,\, L\tilde{\mu}/\mu_0)}{\Gamma(K)} \tag{8.4.51}$$

where $\Gamma(K, x)$ is the incomplete gamma function.

8.4.2.3 Average BER 

Assuming that the interference term in (8.4.37) is Gaussian distributed, the conditional
BER $P_e(\mu)$ for coherent BPSK is given by [Sha00a]

$$P_e(\mu) = Q\left(\sqrt{2\mu}\right) = \frac{1}{2}\,\mathrm{erfc}\left(\sqrt{\mu}\right) \tag{8.4.52}$$

The average BER may be obtained by averaging over all values of $\mu$ (the SIR) as

$$P_e^{MR} = \int_0^{\infty} P_e(\mu) f_\mu(\mu)\,d\mu \tag{8.4.53}$$

Substituting for $P_e(\mu)$ from (8.4.52) and $f_\mu(\mu)$ from (8.4.43) in (8.4.53), $P_e^{MR}$ for the Rayleigh
fading environment becomes

$$P_e^{MR} = \frac{\Gamma(L+K)}{2\,\Gamma(L)\Gamma(K)}\int_0^{\infty}\mathrm{erfc}\left(\sqrt{\mu}\right)\frac{\tilde{\mu}^{K}\mu^{L-1}}{(\mu+\tilde{\mu})^{L+K}}\,d\mu \tag{8.4.54}$$

Evaluation of the integral leads to [Sha00a]

pMR _


2 Vjtr(L)r(K) 

1 

2 

+ r(K)r(L)r| 


A L r|--K|r(L + K) 


(-K) 


,F 2 | L + K, K;K + -,K + l;n 


1 1 3 


r l l r l L + a 12 F zl l+ 2 ' 2' 2~ K; ^ 


(8.4.55) 


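where $_2F_2(\cdot)$ is a hypergeometric function. An expression for the generalized
hypergeometric function is given by [And85]

$${}_pF_q(a_1, \ldots, a_p;\, b_1, \ldots, b_q;\, x) = \sum_{n=0}^{\infty}\frac{(a_1)_n \cdots (a_p)_n}{(b_1)_n \cdots (b_q)_n}\,\frac{x^n}{n!} \tag{8.4.56}$$

with

$$(x)_n = \frac{\Gamma(x+n)}{\Gamma(x)} \tag{8.4.57}$$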


8.5 Optimal Combiner 

An optimal combiner (OC) or beamformer as discussed in Chapter 2 maximizes the SNR
at the output of the combiner, and is useful in canceling unwanted interferences in
nonfading and uncorrelated environments when the system has more degrees of freedom
than the number of interferences present. In mobile communications, the situation is
different from that assumed in Chapter 2. There are generally more co-channel interferences
than the number of elements in the array; these interferences may not be as strong
as the desired signal and fading conditions prevail. Under these conditions, the OC is not
able to fully cancel all interferences. However, it is able to achieve performance improvement
by combating the effect of fading and causing some reduction in the power of co-channel
interferences entering the receiver [Win84, Win87, Win87a].

In this section, OC performance is examined when there are more co-channel interferences
than the number of elements in the array in fading conditions. Expressions for the
average BER and the probability of errors are derived using the procedure presented in
[Sha98, Sha00]. For analytical simplicity, it is assumed that all interferences are of equal
power and that the system is interference limited. Thus, the effect of noise is ignored.

Let $\mathbf{w}_{oc}$ denote the weights of the OC given by (2.4.1). For an interference-limited system
when the effect of noise is ignored, the noise-only array correlation matrix $R_N$ in (2.4.1) is
identical to the correlation matrix due to interference $R_I$. Thus, $\mathbf{w}_{oc}$ can be estimated using

$$\mathbf{w}_{oc} = \alpha_0 R_I^{-1}\mathbf{c}_s \tag{8.5.1}$$

where $\alpha_0$ is an arbitrary constant and the steering vector in the look direction has been
replaced by the signal channel gain vector. Let $R_I$ be estimated using

$$R_I = p_I\sum_{j=1}^{K}\mathbf{c}_{Ij}\mathbf{c}_{Ij}^H \tag{8.5.2}$$

with $p_I$ denoting the power of each interference, and $\mathbf{c}_{Ij}$ denoting the channel gain vector
for the jth interference.

It can be shown [Gir77] that for $K \ge L$, $R_I^{-1}$ exists with probability one if

$$R = E\left[\mathbf{c}_{Ij}\mathbf{c}_{Ij}^H\right]$$

is positive definite. Thus, it is assumed here that $R_I^{-1}$ exists.


8.5.1 Mean Signal Power to Interference Power Ratio 

Let $S_{OC}$ and $I_{OC}$ denote the signal power and the interference power at the output of OC,
respectively. These are given by

$$S_{OC} = \mathbf{w}_{oc}^H R_s \mathbf{w}_{oc} \tag{8.5.3}$$

and

$$I_{OC} = \mathbf{w}_{oc}^H R_I \mathbf{w}_{oc} \tag{8.5.4}$$

where

$$R_s = p_s\,\mathbf{c}_s\mathbf{c}_s^H \tag{8.5.5}$$

denotes an estimate of the signal array correlation matrix.

Substituting for $R_s$ and $\mathbf{w}_{oc}$, it follows that

$$S_{OC} = \alpha_0^2\, p_s\left(\mathbf{c}_s^H R_I^{-1}\mathbf{c}_s\right)^2 \tag{8.5.6}$$

and

$$I_{OC} = \alpha_0^2\left(\mathbf{c}_s^H R_I^{-1}\mathbf{c}_s\right) \tag{8.5.7}$$

Let $\mu$ denote the signal power to interference power ratio at the output of the OC. Thus, it
follows from (8.5.6) and (8.5.7) that

$$\mu = \frac{S_{OC}}{I_{OC}} = p_s\left(\mathbf{c}_s^H R_I^{-1}\mathbf{c}_s\right) \tag{8.5.8}$$

In Rayleigh fading channels, the pdf of $\mu$, $f_\mu(\mu)$, is given by [Sha98]

$$f_\mu(\mu) = \frac{\Gamma(K+1)}{\Gamma(L)\Gamma(K+1-L)}\,\frac{\tilde{\mu}^{K+1-L}\mu^{L-1}}{(\mu+\tilde{\mu})^{K+1}}, \quad \mu \ge 0,\ 1 \le L \le K \tag{8.5.9}$$

and the mean value of SIR at the output of the OC, $\bar{\mu}_{OC}$, becomes

$$\bar{\mu}_{OC} = E[\mu] = \int_0^{\infty}\mu f_\mu(\mu)\,d\mu = \frac{L}{K-L}\,\tilde{\mu}, \quad 1 \le L < K \tag{8.5.10}$$

where

$$\tilde{\mu} = \frac{p_s}{p_I} \tag{8.5.11}$$

Note that for K = L, $E[\mu]$ does not exist. It follows from (8.5.10) that for $K \gg L$, the mean
SIR is proportional to the number of branches in the combiner.
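
The SIR pdfs of the MRC (Section 8.4.2.1) and of the OC (this section) share one functional form, a scaled beta-prime density with shape pair (L, K) for the MRC and (L, K + 1 - L) for the OC. Under that reading, the sketch below integrates the pdf numerically to check the two mean-SIR expressions, L/(K - 1) and L/(K - L) times the input power ratio. The function names, the change of variable, and the example values are assumptions for illustration.

```python
import math


def sir_pdf(mu, mu_in, a, b):
    """Scaled beta-prime pdf: a = L, b = K for MRC; a = L, b = K + 1 - L for OC.

    mu_in is the input signal-to-interference power ratio p_s / p_I.
    """
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_pdf = (log_norm + b * math.log(mu_in) + (a - 1) * math.log(mu)
               - (a + b) * math.log(mu + mu_in))
    return math.exp(log_pdf)


def mean_sir_numeric(mu_in, a, b, steps=20000):
    """Mean of the pdf via the substitution mu = mu_in * t / (1 - t), 0 < t < 1."""
    h = 1.0 / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * h
        mu = mu_in * t / (1.0 - t)
        jacobian = mu_in / (1.0 - t) ** 2  # d(mu)/dt
        total += mu * sir_pdf(mu, mu_in, a, b) * jacobian
    return total * h


L, K, mu_in = 2, 6, 4.0
mrc_mean = mean_sir_numeric(mu_in, L, K)          # compare with L*mu_in/(K-1)
oc_mean = mean_sir_numeric(mu_in, L, K + 1 - L)   # compare with L*mu_in/(K-L)
print(mrc_mean, L * mu_in / (K - 1))
print(oc_mean, L * mu_in / (K - L))
```

For the same L and K, the OC mean exceeds the MRC mean whenever L > 1, which reflects the partial interference suppression described above.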


8.5.2 Outage Probability

The outage probability $P^o_{OC}$ is given by

$$P^o_{OC} = F_\mu(\mu_0) = \int_0^{\mu_0} f_\mu(\mu)\,d\mu \tag{8.5.12}$$

Substituting for $f_\mu(\mu)$ from (8.5.9) and evaluating the integral, (8.5.12) becomes [Sha98]

$$P^o_{OC} = \frac{\Gamma(K+1)}{\Gamma(L+1)\Gamma(K+1-L)}\left(\frac{\mu_0}{\tilde{\mu}}\right)^{L}{}_2F_1\!\left(K+1,\, L;\, L+1;\, -\frac{\mu_0}{\tilde{\mu}}\right) \tag{8.5.13}$$

where $_2F_1(a, b; c; x)$ is a hypergeometric function given by (8.4.49).


8.5.3 Average Bit Error Rate 

In this section, an expression for the average BER for BPSK signals is presented in a slow
Rayleigh fading environment employing a coherent receiver. The computation of average
BER requires the distribution of interferences at the array output. When interferences are
not identically distributed, the central limit theorem cannot strictly be invoked to claim
the Gaussian property for the sum of the binary RVs, but for large numbers of interferences
the Gaussian property is often assumed for such analyses and the results presented here
are under this assumption.

The conditional BER (for a given $\mu$) for coherent BPSK is given by (8.4.52). The average
BER is then obtained by averaging over all values of $\mu$. Thus,

$$P_e^{OC} = \frac{1}{2}\int_0^{\infty}\mathrm{erfc}\left(\sqrt{\mu}\right) f_\mu(\mu)\,d\mu \tag{8.5.14}$$

Substituting for $f_\mu(\mu)$ in (8.5.14) from (8.5.9) and evaluating the integral [Sha98],


p°c _ 


2 Vrtr(L)r(K + l-L) 


ih-ic-iWi) 


(L-K-l) 


2 F 2 |K + 1, K + l-L;K-L + ^,K-L + 2;p 


+ £K K - L 4H4H L 4W L 4I ;L - K 4f ;ii 


(8.5.15) 


+ r(K + l-L)VjcT(L)] 

where $_2F_2(\cdot)$ is a hypergeometric function given by (8.4.56).


8.6 Generalized Selection Combiner 

A conventional selection combiner discussed in Section 8.1 selects the signal from a branch
with the strongest signal, normally with the largest instantaneous SNR. A generalized
selection combiner (GSC), on the other hand, selects the strongest signals from more than one
branch and combines these selected signals coherently using an MRC or an EGC. In
maximum ratio combining, the selected signals are combined coherently with a gain
proportional to the amplitude of the signal received on respective branches as discussed
in Section 8.4, whereas in equal gain combining the gain on each branch is the same as
discussed in Section 8.3.

Thus, a GSC is a two-stage processor as shown in Figure 8.2, where the first stage selects
the $L_c$ strongest signals from L branches and the second stage combines these using an
MRC (or EGC). It becomes an MRC (or EGC) when all branches are selected at the first
stage, and becomes an SC when only one branch is selected.

The MRC and EGC combine all branches, including those with poor SNR that contribute
only marginally to the information and are a possible source of errors.
Thus, a GSC is expected to be more robust than the MRC and EGC in the presence of
channel gain errors. On the other hand, an SC only processes one branch, and thus may
be losing too much information in the process. Thus, a GSC is expected to offer advantages
over an SC. For more details on the GSC and its performance under various noise
environments, see [Alo00, Eng96, Kon98, Roy96].

FIGURE 8.2

Block diagram of a generalized selection combiner.

In this section, the performance of a GSC is studied and the use of moment-generating
functions and the CFs to evaluate various performance measures is explained.


8.6.1 Moment-Generating Functions 

The moment-generating function (MGF) of an RV x, $M_x(S)$, is defined as [Alo00]

$$M_x(S) = E[e^{Sx}] = \int_0^{\infty} e^{Sx} f_x(x)\,dx \tag{8.6.1}$$

It is related to the Laplace transform of the pdf of x, $f_x$, by

$$\mathcal{L}(f_x) = M_x(-S) \tag{8.6.2}$$

and thus one is able to obtain the pdf of x from $M_x$ by taking the inverse Laplace transform
on both sides of (8.6.2).

The MGF is related to the CF of x, $\psi_x(j\omega)$, by [Pro95]

$$\psi_x(j\omega) = M_x(S)\big|_{S=j\omega} \tag{8.6.3}$$

and the cumulant function of x by [Abr72]

$$\Phi_x(S) = \ln\left(M_x(S)\right) \tag{8.6.4}$$

The mean value of x may be obtained from $\Phi_x$ using

$$\bar{x} = \frac{d\Phi_x(S)}{dS}\bigg|_{S=0} \tag{8.6.5}$$

or alternately,

$$\bar{x} = -j\,\frac{d\psi_x(j\omega)}{d\omega}\bigg|_{\omega=0} \tag{8.6.6}$$

The use of CF to evaluate average BER is discussed in Section 8.3.

Now the performance of an L-branch GSC is examined by deriving expressions for the
mean output SNR, outage probability, and average BER when it selects $L_c$ signals from
branches with the largest instantaneous SNR in a noise-limited Rayleigh fading environment
and combines them using an MRC.


8.6.2 Mean Output Signal-to-Noise Ratio 

Section 8.4 shows that the SNR at the output of the MRC is the sum of the SNR at each
branch. Let $\gamma(l)$ denote the ordered SNR at the lth branch, such that $\gamma(1) \ge \gamma(2) \ge \ldots \ge \gamma(L)$.
Thus, it follows that the SNR at the output of the GSC is the sum of the SNR at selected
branches. Hence,

$$\gamma_{GS} = \sum_{l=1}^{L_c}\gamma(l) \tag{8.6.7}$$

It should be noted that even when $\gamma_l$, l = 1, 2, ..., L are i.i.d. RVs, RVs $\gamma(l)$, l = 1, 2, ..., $L_c$
are not i.i.d. RVs [Alo00].

The mean SNR at the output of the GSC, $\Gamma_{GS}$, is then given by

$$\Gamma_{GS} = E[\gamma_{GS}] = \sum_{l=1}^{L_c}\bar{\gamma}(l) \tag{8.6.8}$$

where $\bar{\gamma}(l)$ is the mean value of the ordered SNR on a selected branch, and is given by

$$\bar{\gamma}(l) = \int_0^{\infty}\gamma f_{\gamma(l)}(\gamma)\,d\gamma \tag{8.6.9}$$

with $f_{\gamma(l)}$ denoting the pdf of the ordered SNR $\gamma(l)$.

When $\gamma_l$, l = 1, 2, ..., L are i.i.d. RVs, $f_{\gamma(l)}$ can be expressed in terms of $f_\gamma$, the pdf of the
unordered SNR. It is given by [Alo00]

$$f_{\gamma(l)}(\gamma) = \frac{L!}{(L-l)!\,(l-1)!}\left[F_\gamma(\gamma)\right]^{L-l}\left[1-F_\gamma(\gamma)\right]^{l-1} f_\gamma(\gamma) \tag{8.6.10}$$

where $F_\gamma$ is the cdf of $\gamma$.

For Rayleigh fading channels, $f_\gamma$ and $F_\gamma$ are, respectively, given by (7.3.12) and (7.3.11).
Substituting in (8.6.10),

$$f_{\gamma(l)}(\gamma) = \frac{L!}{\Gamma\,(L-l)!\,(l-1)!}\left(1-e^{-\gamma/\Gamma}\right)^{L-l} e^{-l\gamma/\Gamma} \tag{8.6.11}$$

Substituting this in (8.6.9) with u denoting $e^{-\gamma/\Gamma}$, the expression for $\bar{\gamma}(l)$ becomes

$$\bar{\gamma}(l) = \frac{-\Gamma\, L!}{(L-l)!\,(l-1)!}\int_0^{1}(1-u)^{L-l}u^{l-1}\ln u\,du = \Gamma\sum_{k=0}^{L-l}\frac{1}{l+k} \tag{8.6.12}$$

which, along with (8.6.8), yields

$$\Gamma_{GS} = \Gamma\sum_{l=1}^{L_c}\sum_{k=0}^{L-l}\frac{1}{l+k} = \Gamma L_c\left(1 + \sum_{l=L_c+1}^{L}\frac{1}{l}\right) \tag{8.6.13}$$

Details on the evaluation of (8.6.12) and (8.6.13) are provided in [Alo00]. A derivation
of (8.6.13) is also provided in [Roy96]. It can easily be seen that for $L_c = 1$, (8.6.13) reduces
to (8.1.9), an expression for $\Gamma_{SC}$. For $L_c = L$, it becomes (8.4.6), an expression for $\Gamma_{MR}$. Thus,
(8.6.13) is a generalization of the two results. It can be shown that $\Gamma_{GS}$ is a monotonically
increasing function of the number of selected branches, resulting in $\Gamma_{SC} \le \Gamma_{GS} \le \Gamma_{MR}$ [Kon98]. Thus,
the mean SNR of the GSC is bounded below by $\Gamma_{SC}$ and above by $\Gamma_{MR}$.
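
The two special cases and the monotonic behavior can be checked with a few lines of code. The sketch below evaluates the mean of each ordered branch SNR in Rayleigh fading as Gamma times the sum of 1/j for j = l, ..., L, and the GSC mean as Gamma * L_c * (1 + sum of 1/l for l = L_c + 1, ..., L). Function names and example values are assumptions for illustration.

```python
def ordered_branch_mean_snr(mean_snr, branches, rank):
    """Mean of the rank-th largest of L i.i.d. exponential SNRs, rank = 1..L."""
    return mean_snr * sum(1.0 / (rank + k) for k in range(branches - rank + 1))


def gsc_mean_snr(mean_snr, branches, selected):
    """Mean output SNR of a GSC that combines the L_c strongest of L branches."""
    extra = sum(1.0 / l for l in range(selected + 1, branches + 1))
    return mean_snr * selected * (1.0 + extra)


mean_snr, L = 1.0, 5
for Lc in range(1, L + 1):
    by_order_stats = sum(ordered_branch_mean_snr(mean_snr, L, l)
                         for l in range(1, Lc + 1))
    print(Lc, gsc_mean_snr(mean_snr, L, Lc), by_order_stats)
```

The two printed columns should coincide, with the first row equal to the selection-combiner mean and the last row equal to L times the branch mean.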

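An alternative derivation of (8.6.13) using moment-generating functions is presented
below. An expression for $M_{\gamma_{GS}}(S)$, the MGF of $\gamma_{GS}$ in the Rayleigh fading environment, is
given by [Alo00]:

$$M_{\gamma_{GS}}(S) = (1-S\Gamma)^{-L_c+1}\prod_{l=L_c}^{L}\left(1-\frac{S\Gamma L_c}{l}\right)^{-1} \tag{8.6.14}$$

Alternately, in summation form the expression becomes

$$M_{\gamma_{GS}}(S) = (1-S\Gamma)^{-L_c+1}\sum_{l=0}^{L-L_c}(-1)^{l}\,\frac{\binom{L}{L_c}\binom{L-L_c}{l}}{1+\dfrac{l}{L_c}-S\Gamma} \tag{8.6.15}$$

Substituting for $M_{\gamma_{GS}}$ from (8.6.14) in (8.6.4), one obtains an expression for the cumulant
function:

$$\Phi_{\gamma_{GS}}(S) = (-L_c+1)\ln(1-S\Gamma) - \sum_{l=L_c}^{L}\ln\left(1-\frac{S\Gamma L_c}{l}\right) = -L_c\ln(1-S\Gamma) - \sum_{l=L_c+1}^{L}\ln\left(1-\frac{S\Gamma L_c}{l}\right) \tag{8.6.16}$$

which, along with (8.6.5), yields (8.6.13).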
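
The MGF route to the mean can be illustrated numerically. The sketch below codes the product form of the GSC MGF, the equivalent summation form, and a central-difference estimate of the derivative of ln M(S) at S = 0, which should match Gamma * L_c * (1 + sum of 1/l for l = L_c + 1, ..., L). The function names, the finite-difference step, and the test point are assumptions for this example.

```python
import math


def gsc_mgf_product(s, mean_snr, branches, selected):
    """Product form: (1 - s*G)^(-Lc) * prod_{l=Lc+1..L} (1 - s*G*Lc/l)^(-1)."""
    value = (1.0 - s * mean_snr) ** (-selected)
    for l in range(selected + 1, branches + 1):
        value /= 1.0 - s * mean_snr * selected / l
    return value


def gsc_mgf_sum(s, mean_snr, branches, selected):
    """Summation form with binomial coefficients (partial-fraction style)."""
    lead = math.comb(branches, selected) * (1.0 - s * mean_snr) ** (-selected + 1)
    total = sum((-1) ** l * math.comb(branches - selected, l)
                / (1.0 + l / selected - s * mean_snr)
                for l in range(branches - selected + 1))
    return lead * total


def mean_from_cumulant(mgf, step=1e-6):
    """Central-difference estimate of d/dS ln M(S) at S = 0."""
    return (math.log(mgf(step)) - math.log(mgf(-step))) / (2.0 * step)


mean_snr, L, Lc = 2.0, 4, 2
mean_numeric = mean_from_cumulant(lambda s: gsc_mgf_product(s, mean_snr, L, Lc))
mean_closed = mean_snr * Lc * (1.0 + sum(1.0 / l for l in range(Lc + 1, L + 1)))
print(mean_numeric, mean_closed)
print(gsc_mgf_product(-0.3, mean_snr, L, Lc), gsc_mgf_sum(-0.3, mean_snr, L, Lc))
```

Agreement between the product and summation forms at a test point is a useful sanity check before either is used inside a BER integral.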
To derive the pdf of $\gamma_{GS}$, rewrite (8.6.14) by replacing l with $l + L_c$ to shift the range of
product terms from $l = L_c, \ldots, L$ to $l = 0, \ldots, L - L_c$:

$$\begin{aligned}
M_{\gamma_{GS}}(S) &= (1-S\Gamma)^{-L_c+1}\prod_{l=0}^{L-L_c}\left(1-\frac{S\Gamma L_c}{l+L_c}\right)^{-1} \\
&= (1-S\Gamma)^{-L_c+1}\prod_{l=0}^{L-L_c}\frac{l+L_c}{L_c}\left(1+\frac{l}{L_c}-S\Gamma\right)^{-1} \\
&= \frac{L!}{L_c!\,L_c^{\,L-L_c}}\,(1-S\Gamma)^{-L_c+1}\prod_{l=0}^{L-L_c}\left(1+\frac{l}{L_c}-S\Gamma\right)^{-1}
\end{aligned} \tag{8.6.17}$$

The required $f_{\gamma_{GS}}$ may be obtained by substituting $M_{\gamma_{GS}}$ from (8.6.17) in (8.6.2), and then
taking the inverse Laplace transform. An expression for $f_{\gamma_{GS}}$ becomes [Roy96]




a,y 1 e r 

r 1+1 u 



(8.6.18) 


where a, are the coefficients of the partial fraction expansion of (8.6.17) and are given by 

(8.6.19) 



Ml 

L-L c 

(\ T 1 

a L c - 1 - 1 _ 

J —i ' 

L J 

III 

JL — JL 

c 

k k 


V C/ 

k=l 

\ J 


(-l) k+1+1 Li c 
k 1 


and 


b k = 



hJ 

1 

hJ 


l k j 


(- 1 ) 


L c +k^L c -l 


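(8.6.20)

The expression also has a compact form given by [Alo00]:

$$f_{\gamma_{GS}}(\gamma) = \binom{L}{L_c}\left[\frac{\gamma^{L_c-1}e^{-\gamma/\Gamma}}{\Gamma^{L_c}(L_c-1)!} + \frac{1}{\Gamma}\sum_{l=1}^{L-L_c}(-1)^{L_c+l-1}\binom{L-L_c}{l}\left(\frac{L_c}{l}\right)^{L_c-1} e^{-\gamma/\Gamma}\left(e^{-\frac{l\gamma}{L_c\Gamma}} - \sum_{m=0}^{L_c-2}\frac{1}{m!}\left(\frac{-l\gamma}{L_c\Gamma}\right)^{m}\right)\right] \tag{8.6.21}$$

8.6.3 Outage Probability

Let $P^o_{GS}$ denote the outage probability for GSC. It is defined as

$$P^o_{GS} = F_{\gamma_{GS}}(\gamma_0) \tag{8.6.22}$$

An expression for $F_{\gamma_{GS}}(\gamma)$ may be obtained by integrating (8.6.18) [Roy96]: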


where 


and 


L c -1 


1=0 


A.y'e r 

r 1 ! 



k=l 


e 


1 + 




y 

r 


(8.6.23) 


A,= 


E* 


(8.6.24) 


„ Lb. 

B. = c k 

k k + L 


(8.6.25) 


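An expression for $P^o_{GS}$ follows from (8.6.22) and (8.6.23). A closed-form expression for
$P^o_{GS}$ is obtained by integrating (8.6.21) and using (8.6.22) [Alo00]:

$$P^o_{GS} = \binom{L}{L_c}\left\{1 - e^{-\gamma_0/\Gamma}\sum_{l=0}^{L_c-1}\frac{(\gamma_0/\Gamma)^{l}}{l!} + \sum_{l=1}^{L-L_c}(-1)^{L_c+l-1}\binom{L-L_c}{l}\left(\frac{L_c}{l}\right)^{L_c-1}\left[\frac{1-e^{-\left(1+\frac{l}{L_c}\right)\frac{\gamma_0}{\Gamma}}}{1+\dfrac{l}{L_c}} - \sum_{m=0}^{L_c-2}\left(\frac{-l}{L_c}\right)^{m}\left(1 - e^{-\gamma_0/\Gamma}\sum_{k=0}^{m}\frac{(\gamma_0/\Gamma)^{k}}{k!}\right)\right]\right\} \tag{8.6.26}$$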

8.6.4 Average Bit Error Rate 

The average BER for the GSC can be obtained by averaging the conditional BER for a
given $\gamma$ over all $\gamma$ using an expression for $f_{\gamma_{GS}}$. Using (8.6.18) and (7.3.55), an expression
for the average BER for coherent BPSK becomes [Roy96]


p GS = 


rl 

1=0 


2U 

1 


A x r 2 

4(r+1) 1 


B k r 2 


2 r+i+ 


k V 


(8.6.27) 


Replacing $\Gamma$ by $\Gamma/2$ in (8.6.27), an expression for the average BER for CFSK is obtained.

The MGF may also be used to directly derive expressions for the average BER. For
various modulation schemes, the average BER using the MGF is given in [Alo00]. For
coherent BPSK and CFSK,

$$P_e^{GS} = \frac{1}{\pi}\int_0^{\pi/2} M_{\gamma_{GS}}\!\left(-\frac{g}{\sin^2\phi}\right)d\phi, \quad g = \begin{cases}1 & \text{BPSK} \\ 0.5 & \text{CFSK}\end{cases} \tag{8.6.28}$$

Substituting for $M_{\gamma_{GS}}$ from (8.6.15) and carrying out the integral,


pGS _ 


L-L c (“I) 

I 


a-L 


V 1 y 


h 


,L J.1 

v cy 1=0 1 + — 


-■ eT -§0 
2 1 + 1 

L 


where 


In ( 6 ; O, c 2 )=l 


f f Sin 2 (|) 'j 

Y Sin 2 (|) > 

[IsinVcJ 

.Sin^ + O, 


'c J 


d(|) 


and can be evaluated using the following equations [Alo00].
For $C_1 = C_2 = C$,


(8.6.29) 


(8.6.30) 


with 


U 0 ;O=-f- 

n 


l + Sgn(0-7t) , T) ( C 
' 


71 JM i+c 


k=0 


[4(1 + C)f 


2 ; c 


n-i K-i 

II 


71 v 1 + C {-**-*[ j j 

k=0 j=0 V J J 


2k^ 


(-l) l+k Sin[(2k-2j)T] 

[4(1 +Of 2k - 2 l 


, o<e<27i 


T = —arctanf—)+ — -SgnNlf 1 + SgnP 

2 IdJ 2 & l 2 J 


(8.6.31) 


(8.6.32) 


\ 2 C( 1 i C) Sin(20) 

D = (1 + 2C)Cos(20) -1 


and sgn(x) denoting the sign of x.
For $C_1 \ne C_2$,

I„(0; C 1 ,C 2 ) = I n (0; o) 


1 + Sgn(0 - k) T, 
2 


C, 


K J]] 1 + C 2 


c. 




vO Oy 


l + Sgn(0-7t) | T t 

v 2 71 / 


O ^ k 


O+O^vO-c 


-iy 


(8.6.33) 

(8.6.34) 


Wnc,)]- 


(8.6.35) 


2 O ^ 


""Y 2kh (_lf k Sin[(2k-2j)^] 

v j JfO + COf 2k “ 2j 


7T ^ 1 + Q 
0 < 0 < 27: 

with $T_1$ and $T_2$ corresponding to T of (8.6.32), with C replaced by $C_1$ and $C_2$, respectively.
The expression for the average BER for BPSK (g = 1) given by (8.6.28) yields the same
numerical results as given by [Eng96] for $L_c = 2$ and $L_c = 3$ using the conventional approach
of averaging the conditional BER over all values of $\gamma$. These expressions are given below.
For $L_c = 2$, the average BER $P_e^{SC2}$ for coherent BPSK is given by


pSC2 

A e 


L ( L ~ 1) 
2 


1 

2 


1- 


1_ 

v 1 + cc 


a 


2(1 + a)yl+a 


L-2 

v k , 
k=l V J 


(-l) k V(k) 


(8.6.36) 


with 


V(k) = 


1 

2 + k 


1_ 

kVT-i-oc 


_ 2 _ 

k(2 + k)Ju^p> 


(8.6.37) 


and 


a = 


1 

r 


(8.6.38) 


For L c = 3, the expression is given by 


, SC 3_L(L-l)(L-2) 




t-p^ 


2 / 2kVl-p 2 ^ 


—. k , 

k=0 V J 


V ** J 


— ^L-3^ 


L (-tgw 


k=l X J 


(8.6.39) 


with 




1- 


1+a 1 + 

i l 3 


1 3 


(8.6.40) 


and 


h = 


Ali+r 


(8.6.41) 


8.7 Cascade Diversity Combiner 

A cascade diversity combiner (CDC) is similar to the GSC discussed in the previous section
in that it employs two-stage diversity combining [Roy96]. However, there are some
differences. The CDC divides the L branches into $L_c$ groups of M branches each, and then
uses a selection combiner to select one best signal from each group at the first stage. At
the second stage, it uses an MRC to combine the $L_c$ signals selected at the first stage. The
$L_c$ selected signals at the first stage are not necessarily the best $L_c$ signals, as is the case
for the GSC. However, the combiner is perhaps easy to implement and analyze, as the
SNR at different branches may be assumed to be i.i.d. RVs. In addition, having equal
numbers of M inputs in different selection combiners helps in implementation. For M = 1,
there is no selection; it is equivalent to an MRC. For $L_c = 1$, there is no combiner and it
is equivalent to a conventional SC. Figure 8.3 shows a block diagram of a predetection
CDC.

FIGURE 8.3

Block diagram of a cascade diversity combiner.

In this section, an analysis of a CDC is presented. Both Rayleigh fading and Nakagami
fading environments are considered [Cho00, Roy96].


8.7.1 Rayleigh Fading Environment 

First, consider the pdf of Yo> the SNR at the output of the CDC. 

8.7.1.1 Output SNR pdf 

It follows from (8.1.7) that when the SNRs on different branches are i.i.d. RVs, the pdf of
the SNR at the output of an $M$-branch selection combiner in a Rayleigh fading environment
is given by

$$f_\gamma(\gamma) = \frac{M}{\Gamma}\, e^{-\gamma/\Gamma}\left(1 - e^{-\gamma/\Gamma}\right)^{M-1} \tag{8.7.1}$$

The characteristic function of $\gamma$ is given by

$$\psi_\gamma(j\omega) = \int_0^\infty e^{j\omega\gamma} f_\gamma(\gamma)\, d\gamma, \qquad \gamma \ge 0 \tag{8.7.2}$$

Substituting for $f_\gamma$ from (8.7.1) and carrying out the integral, it becomes

$$\psi_\gamma(j\omega) = \frac{M!}{\prod_{k=1}^{M}(k - j\omega\Gamma)} \tag{8.7.3}$$

The second stage uses an MRC to combine the $L_c$ signals. As the SNR at the output of
the MRC is the sum of the individual SNRs, it follows that the instantaneous SNR $\gamma_{CD}$
at the output of the CDC is given by

$$\gamma_{CD} = \sum_{l=1}^{L_c} \gamma_l \tag{8.7.4}$$

where $\gamma_l$ denotes the SNR at the output of the $l$th SC.

Since $\gamma_l$, $l = 1, \ldots, L_c$, are i.i.d. RVs, the characteristic function of
$\gamma_{CD}$ is the product of the individual characteristic functions, that is,

$$\psi_{\gamma_{CD}}(j\omega) = \left[\frac{M!}{\prod_{k=1}^{M}(k - j\omega\Gamma)}\right]^{L_c} \tag{8.7.5}$$


One observes from (8.7.5) that $\psi_{\gamma_{CD}}(j\omega)$ has $M$ poles of order $L_c$ each.
Thus, it can be expressed in a summation form suitable for inverse transformation to obtain
the pdf of $\gamma_{CD}$ as follows:

$$\psi_{\gamma_{CD}}(j\omega) = \sum_{k=1}^{M}\sum_{l=0}^{L_c-1} \frac{a_{kl}}{(k - j\omega\Gamma)^{l+1}} \tag{8.7.6}$$

where $a_{kl}$ are the coefficients of the partial fraction expansion of
$\psi_{\gamma_{CD}}(j\omega)$. For a technique to compute these coefficients, see [Gil81].
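One possible way to obtain the coefficients numerically is sketched below. It is not the technique of [Gil81]; it is a plain truncated series expansion around each pole, done in exact rational arithmetic. It assumes the characteristic function has the form $(M!)^{L_c} / \prod_{k=1}^{M} (k - s)^{L_c}$ with $s = j\omega\Gamma$, and that the expansion sought is $\sum_k \sum_l a_{kl} / (k - s)^{l+1}$. The function name `cdc_partial_fractions` is hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def cdc_partial_fractions(M, Lc):
    """Return {(k, l): a_kl} for psi(s) = (M!)**Lc / prod_m (m - s)**Lc.

    The expansion sought is psi(s) = sum_k sum_l a_kl / (k - s)**(l + 1),
    with k = 1..M and l = 0..Lc-1, where s stands for j*omega*Gamma.
    """
    coeffs = {}
    scale = Fraction(factorial(M)) ** Lc
    for k in range(1, M + 1):
        # g_k(t) = scale * prod_{m != k} (m - k + t)**(-Lc), with t = k - s.
        # Keep its power series in t up to degree Lc - 1.
        series = [Fraction(0)] * Lc
        series[0] = scale
        for m in range(1, M + 1):
            if m == k:
                continue
            d = Fraction(m - k)
            # (d + t)**(-Lc) = sum_n C(-Lc, n) * d**(-Lc - n) * t**n,
            # where C(-Lc, n) = (-1)**n * C(Lc + n - 1, n).
            factor = [
                Fraction((-1) ** n * comb(Lc + n - 1, n)) / d ** (Lc + n)
                for n in range(Lc)
            ]
            series = [
                sum(series[i] * factor[n - i] for i in range(n + 1))
                for n in range(Lc)
            ]
        # psi = g_k(t) / t**Lc, so the t**n term yields 1 / t**(Lc - n),
        # i.e. the coefficient a_kl with l + 1 = Lc - n.
        for n in range(Lc):
            coeffs[(k, Lc - 1 - n)] = series[n]
    return coeffs
```

Because the characteristic function equals 1 at $\omega = 0$, the returned coefficients should satisfy $\sum_k \sum_l a_{kl} / k^{l+1} = 1$, which is a convenient check on any implementation.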

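Taking the inverse transformation of (8.7.6), the pdf of $\gamma_{CD}$ then becomes [Roy96]

$$f_{\gamma_{CD}}(\gamma) = \sum_{k=1}^{M}\sum_{l=0}^{L_c-1} \frac{a_{kl}\,\gamma^{l}\, e^{-k\gamma/\Gamma}}{l!\,\Gamma^{l+1}} \tag{8.7.7}$$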


8.7.1.2 Outage Probability 

Let $P^o_{CD}$ denote the outage probability for a CDC. Integrating (8.7.7) yields the cdf
of $\gamma_{CD}$:

$$F_{\gamma_{CD}}(\gamma) = 1 - \sum_{k=1}^{M}\sum_{l=0}^{L_c-1} A_{k,l}\, \frac{\gamma^{l}}{l!\,\Gamma^{l}}\, e^{-k\gamma/\Gamma} \tag{8.7.8}$$

where

$$A_{k,l} = \sum_{i=0}^{L_c-1-l} \frac{a_{k,l+i}}{k^{i+1}} \tag{8.7.9}$$

An expression for the outage probability is obtained by substituting $\gamma_0$ for $\gamma$
in $F_{\gamma_{CD}}$:

$$P^o_{CD} = F_{\gamma_{CD}}(\gamma_0) \tag{8.7.10}$$
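As a quick consistency check, consider $L_c = 1$, for which the CDC reduces to an $M$-branch SC. The sketch below assumes that (8.7.8) has the form $F_{\gamma_{CD}}(\gamma) = 1 - \sum_k \sum_l A_{k,l}\,\gamma^l e^{-k\gamma/\Gamma} / (l!\,\Gamma^l)$ and that (8.7.9) reads $A_{k,l} = \sum_i a_{k,l+i} / k^{i+1}$; only the $l = 0$ terms survive.

```latex
\begin{align*}
a_{k,0} &= \frac{M!}{\prod_{m \ne k}(m - k)} = (-1)^{k-1}\, k \binom{M}{k},
\qquad
A_{k,0} = \frac{a_{k,0}}{k} = (-1)^{k-1} \binom{M}{k}, \\
P^o_{CD} &= 1 - \sum_{k=1}^{M} (-1)^{k-1} \binom{M}{k} e^{-k\gamma_0/\Gamma}
          = \sum_{k=0}^{M} (-1)^{k} \binom{M}{k} e^{-k\gamma_0/\Gamma}
          = \left(1 - e^{-\gamma_0/\Gamma}\right)^{M}.
\end{align*}
```

The last expression is the probability that all $M$ i.i.d. exponentially distributed branch SNRs fall below $\gamma_0$, as expected for a selection combiner in Rayleigh fading.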


8.7.1.3 Mean SNR 

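The mean value of the SNR at the output of the CDC can be obtained using (8.6.3) to (8.6.5)
with $s = j\omega$, that is,

$$\Gamma_{CD} = \frac{1}{j} \left. \frac{d\phi(j\omega)}{d\omega} \right|_{\omega=0} \tag{8.7.11}$$

where

$$\phi(j\omega) = \ln \psi_{\gamma_{CD}}(j\omega) \tag{8.7.12}$$

Substituting for $\psi_{\gamma_{CD}}(j\omega)$ from (8.7.5) in (8.7.12),

$$\phi(j\omega) = L_c \ln \frac{M!}{\prod_{k=1}^{M}(k - j\omega\Gamma)} = L_c \ln M! - L_c \sum_{k=1}^{M} \ln(k - j\omega\Gamma) \tag{8.7.13}$$

Differentiating both sides of (8.7.13) with respect to $\omega$ yields

$$\frac{d\phi(j\omega)}{d\omega} = L_c \sum_{k=1}^{M} \frac{j\Gamma}{k - j\omega\Gamma} \tag{8.7.14}$$

Using this in (8.7.11), it follows that

$$\Gamma_{CD} = L_c\, \Gamma \sum_{k=1}^{M} \frac{1}{k} \tag{8.7.15}$$

Thus, the mean SNR increases by a factor of $L_c \sum_{k=1}^{M} 1/k$ over the single-branch
mean SNR. In fact, the gain in mean SNR is the product of the gain of an $M$-branch SC,
$\sum_{k=1}^{M} 1/k$, and the gain of an $L_c$-branch MRC, $L_c$.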
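The mean-SNR result can be checked numerically. The sketch below assumes the log characteristic function has the form $L_c \ln M! - L_c \sum_k \ln(k - j\omega\Gamma)$ and differentiates it with a central difference at $\omega = 0$; the closed-form gain is computed exactly for comparison. The function names are hypothetical.

```python
import cmath
from fractions import Fraction
from math import factorial, log


def cdc_mean_snr_gain(M, Lc):
    """Exact gain Lc * (1 + 1/2 + ... + 1/M) over the single-branch mean SNR."""
    return Lc * sum(Fraction(1, k) for k in range(1, M + 1))


def log_cf(omega, M, Lc, mean_snr):
    """Logarithm of the assumed CDC characteristic function at `omega`."""
    return Lc * (
        log(factorial(M))
        - sum(cmath.log(k - 1j * omega * mean_snr) for k in range(1, M + 1))
    )


def mean_snr_from_cf(M, Lc, mean_snr, h=1e-6):
    """Mean output SNR as (1/j) * d(log CF)/d(omega) at omega = 0."""
    derivative = (log_cf(h, M, Lc, mean_snr) - log_cf(-h, M, Lc, mean_snr)) / (2 * h)
    return (derivative / 1j).real
```

For example, `cdc_mean_snr_gain(3, 2)` evaluates to `Fraction(11, 3)`, and `mean_snr_from_cf(3, 2, mean_snr)` should agree with `mean_snr * 11 / 3` up to the finite-difference error.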


© 2004 by CRC Press LLC 



8.7.1.4 Average BER 

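The average BER may be obtained by averaging the conditional BER over all values of $\gamma$,
that is,

$$P_e^{CD} = \int_0^\infty P_e(\gamma)\, f_{\gamma_{CD}}(\gamma)\, d\gamma \tag{8.7.16}$$

Consider an example of a coherent BPSK system. Substituting for $P_e(\gamma)$ from (7.3.55) in
(8.7.16),

$$P_e^{CD} = \frac{1}{2}\int_0^\infty \mathrm{erfc}\left(\sqrt{\gamma}\right) f_{\gamma_{CD}}(\gamma)\, d\gamma = \frac{1}{\sqrt{\pi}} \int_0^\infty \int_{\sqrt{\gamma}}^\infty e^{-u^2} f_{\gamma_{CD}}(\gamma)\, du\, d\gamma \tag{8.7.17}$$

where the second form follows by substituting for $\mathrm{erfc}(\sqrt{\gamma})$. Changing the
order of integration, it becomes

$$P_e^{CD} = \frac{1}{\sqrt{\pi}} \int_0^\infty e^{-u^2} F_{\gamma_{CD}}(u^2)\, du \tag{8.7.18}$$

Using (8.7.8), the formulas

$$\int_0^\infty u^{a} e^{-bu^2}\, du = \frac{\Gamma\left(\frac{a+1}{2}\right)}{2\, b^{\frac{a+1}{2}}} \tag{8.7.19}$$

and

$$2^{2x-1}\, \Gamma(x)\, \Gamma\left(x + \tfrac{1}{2}\right) = \sqrt{\pi}\, \Gamma(2x) \tag{8.7.20}$$

where $\Gamma(\cdot)$ with an argument denotes the gamma function, and carrying out the
integral [Roy96], the expression becomes

$$P_e^{CD} = \frac{1}{2} - \sum_{k=1}^{M}\sum_{l=0}^{L_c-1} \binom{2l}{l} \frac{A_{k,l}\, \Gamma^{1/2}}{[4(\Gamma + k)]^{\,l+\frac{1}{2}}} \tag{8.7.21}$$

Now consider a case of $M = 1$ and $L_c = L$. In this situation, the CDC becomes an MRC.
It follows from comparing the terms in (8.7.5) and (8.7.6) that

$$a_{1,l} = \begin{cases} 1 & l = L-1 \\ 0 & \text{otherwise} \end{cases} \tag{8.7.22}$$

Thus, from (8.7.9) using $L_c = L$ and (8.7.22):

$$A_{1,l} = 1, \qquad l = 0, \ldots, L-1 \tag{8.7.23}$$

Substituting this in (8.7.21),

$$P_e^{CD} = \frac{1}{2} - \sum_{l=0}^{L-1} \binom{2l}{l} \frac{\Gamma^{1/2}}{[4(\Gamma + 1)]^{\,l+\frac{1}{2}}} \tag{8.7.24}$$

It is left as an exercise for the reader to show that this is the same as (8.4.19).

Note that using $M = 1$, $L_c = L$, (8.7.22) and (8.7.23) in (8.7.7) and (8.7.8), respectively,
leads to (8.4.10) and (8.4.13).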
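The MRC special case lends itself to a numerical check. The sketch below assumes that (8.7.24) reads $P_e = \frac{1}{2} - \sum_{l=0}^{L-1} \binom{2l}{l}\, \Gamma^{1/2} / [4(\Gamma+1)]^{l+1/2}$ and compares it with a commonly quoted closed form for coherent BPSK with $L$-branch MRC in i.i.d. Rayleigh fading. Whether that closed form matches (8.4.19) symbol for symbol is an assumption. Function names are hypothetical.

```python
from math import comb, sqrt


def ber_bpsk_cdc_mrc_case(L, mean_snr):
    """Assumed form of (8.7.24): the M = 1, Lc = L case of the CDC BER."""
    return 0.5 - sum(
        comb(2 * l, l) * sqrt(mean_snr) / (4.0 * (mean_snr + 1.0)) ** (l + 0.5)
        for l in range(L)
    )


def ber_bpsk_mrc_reference(L, mean_snr):
    """Textbook closed form for BPSK with L-branch MRC in Rayleigh fading.

    mu = sqrt(mean_snr / (1 + mean_snr));
    P = ((1 - mu) / 2)**L * sum_l C(L - 1 + l, l) * ((1 + mu) / 2)**l.
    """
    mu = sqrt(mean_snr / (1.0 + mean_snr))
    return ((1.0 - mu) / 2.0) ** L * sum(
        comb(L - 1 + l, l) * ((1.0 + mu) / 2.0) ** l for l in range(L)
    )
```

Evaluating both functions for a few values of `L` and `mean_snr` should give results that agree to within floating-point rounding.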

Now, consider an example DPSK system. For DPSK, $P_e(\gamma)$ is given by (8.4.17).
Substituting (8.4.17) and (8.7.7) in (8.7.16) and carrying out the integral [Roy96],


L c —1 L c -l-i L c -1 L c -1 

*”-EEEE 

i=0 j=0 k=l 1=0 


p L c -n 

f i+i) 


l 1 J 

i—1 

2 2L c-i(r+k) i+1+1 


(8.7.25) 


where $a_{kl}$ are the same as in (8.7.6). The above result also applies to noncoherent
orthogonal FSK when $\Gamma$ is replaced by $\Gamma/2$.


8.7.2 Nakagami Fading Environment 

In the Nakagami fading environment with the fading parameter $m$ taking integer values,
the pdf of the SNR at the output of an $L$-branch selection combiner when the signals on all
channels are i.i.d. RVs is given by (8.1.20). The characteristic function of $\gamma_{SC}$ is
given by

$$\psi_{\gamma_{SC}}(j\omega) = \int_0^\infty e^{j\omega\gamma} f_{\gamma_{SC}}(\gamma)\, d\gamma, \qquad \gamma \ge 0 \tag{8.7.26}$$

Substituting for $f_{\gamma_{SC}}(\gamma)$ from (8.1.20) in (8.7.26) and evaluating the integral,
the CF of $\gamma_{SC}$ for an $M$-branch selection combiner becomes [Cho00]


v ysc (H= 


m 


M 

r(m). 


Lfr 1 )- 1 ''! 

i=0 v ' jeB 


Cji djjfCjj + m-l)l 

’ 4^ (8-7.27) 


A. 


m(i + l) 


+ jco 


The characteristic function of $\gamma_{CD}$ is the product of the individual characteristic
functions; thus, it is given by

$$\psi_{\gamma_{CD}}(j\omega) = \left\{\psi_{\gamma_{SC}}(j\omega)\right\}^{L_c} \tag{8.7.28}$$

Following a procedure similar to the Rayleigh fading case described in Section 8.7.1, the
pdf of the SNR of the cascade receiver is given by [Cho00]


$$f_{\gamma_{CD}}(\gamma) = \sum_{i=0}^{M-1}\sum_{l=0}^{L'-1} a_{il}\, \gamma^{l}\, e^{-m(i+1)\gamma/\Gamma} \tag{8.7.29}$$

where

$$L' = \left(\max_j c_{ji} + m\right) L_c \tag{8.7.30}$$

and $a_{il}$ are the partial fraction coefficients.

8.7.2.1 Average BER 

Consider an example of differential QPSK [Cho00]. The conditional BER for an $L_c$-branch
MRC using differential QPSK in AWGN channels is given by


where 


P(y) e^£[ 2 l]"l,.( 2 y) 

n=l 

+ e- 2 n 0 (,2y)R Lc 

Lc- 1 

n=l 


R, 


1 

22L-1 



2L-K 

i , 


(8.7.31) 


(8.7.32) 
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n,L 


1 

22L-1 


L-l-n 
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j=0 


"2L-f 

/ ,— \n / i — \n 

v i > 

( \2 +l) -(a2-1) 

- 


and $I_n(\cdot)$ is the modified Bessel function of the first kind and order $n$.
The average BER then becomes 


(8.7.33) 


P = 


|p e (Y)f YcD (y)dy 

0 

M-l L'-l 

H a u rltl G( m TT) 

=0 1=0 
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-X 
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+ > R_ 


(n+l)! 
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x — 1 


x + 1 


F -1; 1+1; n + l; 


1 — x 
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1 — x 


(8.7.34) 
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where 


G(m, i, r) = m 2 (i +1) 2 + 4rm(i +1) + 2r 2 


(8.7.35) 


m(i +1) + 2r 


(8.7.36) 


G(m,i,r) 


and $F(a, b; c; x) = {}_2F_1(a, b; c; x)$ is the hypergeometric function given by (8.4.49).


8.8 Macroscopic Diversity Combiner 

The signal envelope undergoes fast fluctuations due to local phenomena, and superimposed
on these fluctuations is a slowly varying mean signal level due to shadowing, as
discussed in Chapter 7. The fast-varying signal components received on spatially separated
antennas may be regarded as uncorrelated with antenna spacing of the order of half
a carrier wavelength. However, this is not the case for the slowly varying mean levels. The
various space-diversity techniques discussed in previous sections require independent
fading components. These space-diversity techniques are normally referred to as
microdiversity techniques, and are only useful in combating the effect of fast fading.

A space-diversity technique referred to as macrodiversity is employed to overcome the
effect of shadowing. In macrodiversity, a cell is served by a group of geographically
separated base stations, and the base station receiving the strongest mean signal is used to
establish a link with a mobile [Tur91, Abu94b, Abu95, Jak74].


8.8.1 Effect of Shadowing 

In this section, the effect of shadowing on the performance of a system using a microscopic 
selection combiner and microscopic maximal ratio combiner schemes in the Rayleigh 
fading environment is considered [Tur91]. 

8.8.1.1 Selection Combiner 

Let $f_{\gamma_{SC}}$ denote the pdf of the SNR at the output of a system using an $L$-branch SC
for a given mean SNR level, and let $f_\Gamma$ denote the pdf of the mean SNR at the site
employing the SC system. Let $P_e(\gamma)$ denote the BER of a particular modulation scheme at
an SNR of $\gamma$. Then the BER at the output of the SC system is the average over all values
of the SNR, given by

$$P_e(\Gamma) = \int_0^\infty P_e(\gamma)\, f_{\gamma_{SC}}(\gamma)\, d\gamma \tag{8.8.1}$$

This quantity depends on the mean SNR $\Gamma$. When $\Gamma$ is not constant, an average over
all values of $\Gamma$ needs to be carried out to evaluate the average BER. It is given by

$$P_e = \int_0^\infty P_e(\Gamma)\, f_\Gamma(\Gamma)\, d\Gamma \tag{8.8.2}$$

An expression for $f_{\gamma_{SC}}$ in the Rayleigh fading environment is given by (8.1.7).
Rewrite it in the following form:

$$f_{\gamma_{SC}}(\gamma) = \sum_{k=1}^{L} (-1)^{k+1} \binom{L}{k} \frac{k}{\Gamma} \exp\left(-\frac{k\gamma}{\Gamma}\right) \tag{8.8.3}$$

Substituting this in (8.8.1) gives $P_e(\Gamma)$, and $P_e$ then may be obtained using the pdf
of $\Gamma$ in (8.8.2). The mean SNR $\Gamma$ has a log-normal distribution. It follows from
(7.1.23) that its pdf is given by

$$f_\Gamma(\Gamma) = \frac{10}{\sigma\, \Gamma \sqrt{2\pi}\, \ln(10)} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] \tag{8.8.4}$$

where $\Gamma_d$ is the mean value of $\Gamma$ in decibels and $\sigma^2$ is its variance in
decibels. If the BER for a particular modulation scheme is known, the average BER can be
calculated using the above procedure.

Consider an example of the CFSK scheme. For CFSK, $P_e(\gamma)$ is given by (7.3.57), that is,

$$P_e(\gamma) = \frac{1}{2}\, \mathrm{erfc}\sqrt{\frac{\gamma}{2}} \tag{8.8.5}$$

In [Tur91], the average BER for a minimum shift keying (MSK) receiver is derived using

$$P_e(\gamma) = \frac{1}{2M} \sum_{i=1}^{M} \mathrm{erfc}\sqrt{\frac{d_i^2\, \gamma}{2}} \tag{8.8.6}$$

For $M = 1$ and $d_i = 1$, (8.8.6) reduces to (8.8.5). Thus, the results derived for MSK reduce
to those for CFSK when $M = 1$ and $d_i = 1$.

Substituting (8.8.3) and (8.8.5) in (8.8.1) and carrying out the integral [Tur91],

$$P_e(\Gamma) = \frac{1}{2} \sum_{k=0}^{L} (-1)^{k} \binom{L}{k} \frac{1}{\sqrt{1 + 2k/\Gamma}} \tag{8.8.7}$$

which, along with (8.8.4) and (8.8.2), results in the average BER in Rayleigh and log-normal
fading:

$$P_e = \sum_{k=0}^{L} (-1)^{k} \binom{L}{k} \frac{5}{\sigma \sqrt{2\pi}\, \ln(10)} \int_0^\infty \frac{1}{\Gamma \sqrt{1 + 2k/\Gamma}} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] d\Gamma \tag{8.8.8}$$

8.8.1.2 Maximum Ratio Combiner

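Now, consider the MRC in a similar environment. The SNR pdf at the output of the
MRC is given by (8.4.10), that is,

$$f_{\gamma_{MR}}(\gamma) = \frac{\gamma^{L-1}\, e^{-\gamma/\Gamma}}{\Gamma^{L}\, (L-1)!} \tag{8.8.9}$$

Using (8.8.9) and (8.8.5), the average BER for a given mean SNR becomes

$$P_e(\Gamma) = \int_0^\infty P_e(\gamma)\, f_{\gamma_{MR}}(\gamma)\, d\gamma = \frac{1}{2} - \sum_{k=1}^{L} \frac{\Gamma\left(L-k+\frac{1}{2}\right) \sqrt{\Gamma/2\pi}}{2\, (L-k)!\, (1 + 0.5\Gamma)^{L-k+\frac{1}{2}}} \tag{8.8.10}$$

where $\Gamma(\cdot)$ with an argument denotes the gamma function. The average BER after
taking shadowing into consideration using (8.8.4) becomes [Tur91]

$$P_e = \int_0^\infty P_e(\Gamma)\, f_\Gamma(\Gamma)\, d\Gamma = \frac{1}{2} - \sum_{k=1}^{L} \frac{5\, \Gamma\left(L-k+\frac{1}{2}\right)}{\ln(10)\, \pi \sigma\, (L-k)!} \int_0^\infty \frac{1}{\sqrt{\Gamma}\, (1 + \Gamma/2)^{L-k+\frac{1}{2}}} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] d\Gamma \tag{8.8.11}$$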
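The shadowing averages for SC and MRC have no closed form, but they are easy to evaluate numerically. The sketch below is an illustration, not the procedure of [Tur91]. It assumes the fixed-mean BER expressions $P_e(\Gamma) = \frac{1}{2}\sum_{k=0}^{L} (-1)^k \binom{L}{k} / \sqrt{1 + 2k/\Gamma}$ for SC and the gamma-function sum of (8.8.10) for MRC, both for CFSK in Rayleigh fading. The average is taken in the decibel domain, where $x = 10 \log_{10} \Gamma$ is normal with mean `mean_db` and standard deviation `sigma_db`, using a simple trapezoidal rule. All function and parameter names are hypothetical.

```python
from math import comb, exp, factorial, gamma, pi, sqrt


def ber_sc_cfsk(L, mean_snr):
    """Assumed form of (8.8.7): L-branch SC, CFSK, fixed mean SNR."""
    return 0.5 * sum(
        (-1) ** k * comb(L, k) / sqrt(1.0 + 2.0 * k / mean_snr)
        for k in range(L + 1)
    )


def ber_mrc_cfsk(L, mean_snr):
    """Assumed form of (8.8.10): L-branch MRC, CFSK, fixed mean SNR."""
    return 0.5 - sum(
        gamma(L - k + 0.5) * sqrt(mean_snr / (2.0 * pi))
        / (2.0 * factorial(L - k) * (1.0 + 0.5 * mean_snr) ** (L - k + 0.5))
        for k in range(1, L + 1)
    )


def shadowed_ber(ber_given_mean, L, mean_db, sigma_db, steps=2001, span=8.0):
    """Average `ber_given_mean(L, Gamma)` over a log-normal mean SNR.

    Integrates over x = 10*log10(Gamma), which is normal(mean_db, sigma_db**2),
    on the interval mean_db +/- span*sigma_db with the trapezoidal rule.
    """
    lo = mean_db - span * sigma_db
    dx = 2.0 * span * sigma_db / (steps - 1)
    total = 0.0
    for i in range(steps):
        x = lo + i * dx
        density = exp(-((x - mean_db) ** 2) / (2.0 * sigma_db ** 2)) / (
            sigma_db * sqrt(2.0 * pi)
        )
        end_weight = 0.5 if i in (0, steps - 1) else 1.0
        total += end_weight * ber_given_mean(L, 10.0 ** (x / 10.0)) * density
    return total * dx
```

A call such as `shadowed_ber(ber_mrc_cfsk, 2, 10.0, 8.0)` would then give the average BER of a two-branch MRC for a 10 dB mean SNR with 8 dB shadowing spread; working in decibels avoids the $1/\Gamma$ factor of the log-normal pdf.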


8.8.2 Microscopic Plus Macroscopic Diversity 

Figure 8.4 shows a block diagram of a composite microscopic-plus-macroscopic diversity 
system in which transmission from a mobile is received by N different base stations. Each 
station employs an L-branch microscopic diversity system, which may employ any of the 
diversity-combining techniques discussed previously, and produces one output per base 
station. Thus N base stations produce a total of N outputs. A macroscopic diversity scheme 
is then used to produce one output. In principle, the macroscopic diversity scheme may 
use any one of the previous diversity-combining schemes to produce one output from N 
branches. 

In this section, a scheme in which selection diversity is employed to select one of the
$N$ branches is analyzed [Tur91]. Assuming that the signals on the $N$ branches are
log-normally distributed, the pdf of the mean SNR at the output of the $N$-branch
selection-diversity scheme is given by

$$f_{\Gamma_{SD}}(\Gamma) = \frac{10N}{\sigma\, \Gamma \ln(10) \sqrt{2\pi}} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] \left[F\left(\frac{10 \log_{10} \Gamma - \Gamma_d}{\sigma}\right)\right]^{N-1} \tag{8.8.12}$$

where $F(\cdot)$ is the cumulative normal distribution function.

FIGURE 8.4

Block diagram of a macroscopic diversity combiner.
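To make the effect of macroscopic selection concrete, the sketch below works in the decibel domain. It assumes that the $N$ site mean SNRs are i.i.d. log-normal, so that $x = 10 \log_{10} \Gamma$ at each site is normal with mean `mean_db` and standard deviation `sigma_db`, and that the selected site is the one with the largest $x$. The density of the largest of $N$ such values is $N$ times the normal pdf times the normal cdf raised to the power $N - 1$. Function names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def selected_mean_snr_pdf_db(x_db, n_sites, mean_db, sigma_db):
    """Density (per dB) of the largest of n_sites i.i.d. normal dB values."""
    z = (x_db - mean_db) / sigma_db
    normal_pdf = exp(-0.5 * z * z) / (sigma_db * sqrt(2.0 * pi))
    return n_sites * normal_pdf * normal_cdf(z) ** (n_sites - 1)


def trapezoid(func, lo, hi, steps=4001):
    """Plain trapezoidal rule on [lo, hi]."""
    dx = (hi - lo) / (steps - 1)
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, steps - 1):
        total += func(lo + i * dx)
    return total * dx


def selected_mean_db(n_sites, mean_db, sigma_db, span=10.0):
    """Mean (in dB) of the mean SNR at the selected base station."""
    lo = mean_db - span * sigma_db
    hi = mean_db + span * sigma_db
    return trapezoid(
        lambda x: x * selected_mean_snr_pdf_db(x, n_sites, mean_db, sigma_db),
        lo,
        hi,
    )
```

For two sites, the mean of the selected value exceeds `mean_db` by `sigma_db / sqrt(pi)`, which quantifies the macrodiversity gain in this idealized i.i.d. setting.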

The average BER including the shadowing effect may be calculated by averaging the
conditional BER at the output of the microscopic diversity combiner, that is,

$$P_e = \int_0^\infty P_e(\Gamma)\, f_{\Gamma_{SD}}(\Gamma)\, d\Gamma \tag{8.8.13}$$

where $P_e(\Gamma)$ denotes the average BER at the output of the microscopic diversity
combiner for a given mean SNR. For a CFSK system operating in the Rayleigh fading
environment, it is given by (8.8.7) for SC and by (8.8.10) for MRC.
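Let $P_e^{SC}$ and $P_e^{MR}$ denote the average BER when a composite system uses SC as
macroscopic diversity with SC and MRC as microscopic diversity, respectively. Using (8.8.7)
in (8.8.13) along with (8.8.12) yields [Tur91]

$$P_e^{SC} = \sum_{k=0}^{L} (-1)^{k} \binom{L}{k} \frac{5N}{\sigma \sqrt{2\pi}\, \ln(10)} \int_0^\infty \frac{1}{\Gamma \sqrt{1 + 2k/\Gamma}} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] \left[F\left(\frac{10 \log_{10} \Gamma - \Gamma_d}{\sigma}\right)\right]^{N-1} d\Gamma \tag{8.8.14}$$

and using (8.8.10) in (8.8.13) yields [Tur91]

$$P_e^{MR} = \frac{1}{2} - \sum_{k=1}^{L} \frac{5N\, \Gamma\left(L-k+\frac{1}{2}\right)}{\ln(10)\, \pi \sigma\, (L-k)!} \int_0^\infty \frac{1}{\sqrt{\Gamma}\, (1 + \Gamma/2)^{L-k+\frac{1}{2}}} \exp\left[-\frac{(10 \log_{10} \Gamma - \Gamma_d)^2}{2\sigma^2}\right] \left[F\left(\frac{10 \log_{10} \Gamma - \Gamma_d}{\sigma}\right)\right]^{N-1} d\Gamma \tag{8.8.15}$$

Notation and Abbreviations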


AWGN    additive white Gaussian noise
BER    bit error rate
BPSK    binary phase shift keying
CDC    cascade diversity combiner
CFSK    coherent orthogonal frequency shift keying
DPSK    differential binary phase shift keying
EGC    equal gain combiner
GSC    generalized selection combiner
MGF    moment generating function
MRC    maximum ratio combiner
NCFSK    noncoherent orthogonal frequency shift keying
OC    optimal combiner
cdf    cumulative distribution function
pdf    probability density function
RV    random variable
SC    selection combiner
SDC    switched diversity combiner
SIR    signal power to interference power ratio
$\mathbf{c}_S$    channel gain vector for signal
$\mathbf{c}_{Ij}$    channel gain vector for jth interference
$F_\gamma$    cdf of $\gamma$
$f_\gamma$    pdf of $\gamma$
$\hat{f}_\gamma$    pdf of $\gamma$ in weight errors
$I$    total interference power
$I_{OC}$    total interference power at the output of OC
$I_{EG}$    interference power at the output of EGC
$I_{MR}$    interference power at the output of MRC
$\bar{I}_j$    mean power due to jth interference, identical on all branches
$\bar{I}$    mean interference power due to identical interferences on all branches
$I_{ij}$    instantaneous power on ith branch due to jth interference
$K$    number of interferences
$L$    number of branches
$L_c$    number of selected branches
$\mathcal{L}(f)$    Laplace transform of $f$
$M_x$    MGF of $x$
$m$    Nakagami fading parameter
$N$    uncorrelated noise power
$n_i$    noise on ith channel
$\mathbf{n}(t)$    noise vector
$P(\cdot)$    probability of $(\cdot)$
$P_e$    average BER
$P_e(\gamma)$    conditional BER
$P_e^{SC}$    average BER at output of SC
$P_e^{GS}$    average BER at output of GSC
$P_e^{SC2}$    average BER at output of two-branch GSC
$P_e^{SC3}$    average BER at output of three-branch GSC
$P_e^{CD}$    average BER in CDC
$P_e^{MR}$    average BER in MRC
$\hat{P}_e^{MR}$    average BER in MRC with weight errors
$P_e^{SW}$    average BER in SDC
$P_e^{OC}$    average BER in OC
$P^o_{CD}$    outage probability of CDC
$P^o_{EG}$    outage probability of EGC
$P^o_{SC}$    outage probability of SC
$P^o_{SW}$    outage probability of SDC
$P^o_{GS}$    outage probability of GSC
$P^o_{MR}$    outage probability of MRC
$\hat{P}^o_{MR}$    outage probability of MRC in weight errors
$P^o_{OC}$    outage probability of OC
$p_{Ij}$    power of jth interference source
$p_S$    power of signal source
$r_{ij}$    amplitude of the jth interference received on ith branch
$R$    array correlation matrix
$R_S, R_I, R_n$    array correlation matrix of signal, interference, and noise only, respectively
$r$    signal amplitude
$r_i$    signal amplitude received on ith branch
SC2    selection combiner with two branches selected
SC3    selection combiner with three branches selected
$S$    mean signal power identical on all branches
$S_i$    signal power on ith branch
$S_{OC}$    signal power at output of OC
$S_{EG}$    signal power at output of EGC
$S_{MR}$    signal power at output of MRC
$w_i$    weight on the ith branch
$\mathbf{w}$    weight vector
$\mathbf{w}_{OC}$    weight vector of optimal combiner
$x_i(t)$    received signal on ith branch
$\mathbf{x}(t)$    array signal vector
$y(t)$    combiner output
$\Gamma$    mean SNR at a branch
$\Gamma_{EG}$    mean SNR of EGC
$\Gamma_{CD}$    mean SNR at output of CDC
$\Gamma_{SC}$    mean SNR at output of SC
$\Gamma_{GS}$    mean SNR at output of GSC
$\Gamma_{MR}$    mean SNR at output of MRC
$\Gamma_i$    mean SNR at ith branch
$\alpha$    inverse of $\Gamma$
channel attenuation and phase on ith branch
$\alpha_0$    an arbitrary constant
$\psi_{ij}$    phase of jth interference received on ith branch
$\psi_r$    characteristic function of an RV $r$
$\theta_i(t)$    signal phase on ith branch
$\phi_x$    cumulant generating function of $x$
$\gamma$    SNR
$\gamma_0$    threshold value of SNR
$\gamma_l$    SNR of lth branch
$\gamma_{(l)}$    ordered SNR of lth branch
$\bar{\gamma}_{(l)}$    mean value of $\gamma_{(l)}$
$\gamma_{CD}$    SNR of CDC
$\gamma_{GS}$    SNR of GSC
$\gamma_{SC}$    SNR of SC
q 0    threshold value of power
q 0    optimum value of threshold power
$\rho$    correlation coefficient
$\mu$    signal power to interference power ratio
$\mu_0$    threshold value of SIR
$\mu_{SC}$    SIR of SC
$\mu_{EG}$    SIR of EGC
$\mu_{SW}$    SIR of SDC
$\bar{\mu}$    average signal power to average interference power ratio
$\bar{\mu}_{EG}$    average signal power to average interference power ratio of EGC
$\bar{\mu}_{MR}$    mean SIR of MRC
$\bar{\mu}_{OC}$    mean SIR of OC